Dalton_Ruer
Support

So you need to "fake" some data, huh?

Well, just because you asked so nicely and because I value you so much ... I was willing to set aside my 30-year career of ensuring a "Single Source of the Truth" and put this document together to help you take your good data and turn it into a source of random truth.

I may never be trusted again in the industry, but if it helps you manipulate and de-identify your data so that you can demonstrate your applications to others more freely, well then so be it.

2 Replies
Dalton_Ruer
Support
Author

As timing would have it, I needed to generate some fake data this morning and came across another tip.

I wanted to generate random smokers. But is the population of smokers truly random? I did a Google search and discovered that about 18% of the population are smokers. So I used my Random Fractional Value and simply added a clause that if the value was over 0.82 then yes, they were a smoker. Since that value is spread evenly between 0 and 1, roughly 18% of the rows land above 0.82, and my results yielded about 18% as smokers.

IF (RandomNumber > .82, 'Yes', 'No') as Smoker,
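If you want to sanity check the threshold trick outside of a load script, here is a minimal Python sketch of the same idea. It assumes the random value is uniform between 0 and 1. The names `assign_smoker` and `smoker_share` are made up for illustration and are not part of any Qlik API.

```python
import random


def assign_smoker(random_number: float, threshold: float = 0.82) -> str:
    """Mirror of IF (RandomNumber > .82, 'Yes', 'No'): values above the threshold are smokers."""
    return "Yes" if random_number > threshold else "No"


def smoker_share(n: int, seed: int = 42) -> float:
    """Generate n uniform random values and return the fraction flagged as smokers."""
    rng = random.Random(seed)  # seeded so the fake data is repeatable
    flags = [assign_smoker(rng.random()) for _ in range(n)]
    return flags.count("Yes") / n


if __name__ == "__main__":
    # With enough rows the share should hover around 0.18
    print(smoker_share(100_000))
```

The same pattern works for any yes/no attribute: set the threshold to one minus the share you want to hit.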

Again, your imagination is the only barrier to what you can generate in terms of fake data. If you have parameters to fit within, then force it to fit within those parameters.

BTW - My fake ages were all above 10; otherwise I probably would have added more logic to my IF so that I didn't generate infants who were smokers. Even when faking data I want to be as close to the truth as I can be.
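For anyone whose fake ages do include young children, the extra logic might look something like this Python sketch. The cutoff `min_age=10` is purely an assumption borrowed from the remark above, and `assign_smoker_with_age` is a hypothetical helper name, so adjust both to whatever fits your data.

```python
def assign_smoker_with_age(
    random_number: float,
    age: int,
    min_age: int = 10,        # assumed cutoff, not a researched figure
    threshold: float = 0.82,  # same 18% threshold as before
) -> str:
    """Flag a smoker only when the person is old enough AND the random value clears the threshold."""
    if age < min_age:
        return "No"  # never generate smokers below the cutoff age
    return "Yes" if random_number > threshold else "No"


if __name__ == "__main__":
    print(assign_smoker_with_age(0.95, age=2))   # too young, so "No"
    print(assign_smoker_with_age(0.95, age=40))  # old enough and above the threshold, so "Yes"
```

Keep in mind that the age guard pulls the overall share of smokers below 18% if a lot of your rows fall under the cutoff, so you may want to nudge the threshold to compensate.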

Anonymous
Not applicable

This is timely... The Bayesian Trap - YouTube