Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 
joey_lutes
Partner - Creator
Partner - Creator

Select Random Data from existing tables to populate new tables

I'm attempting to select some generate a large amount of data automatically.

I have a table from which I would like to select some random data, populating into another table.

Example existing table (populated with data - let's say 1000 rows)

Person:

     PersonID    

     FirstLast

    

I have another table which I would like to populate some data randomly from the Person: table.  Ex:

Animal:

     AnimalID

     AnimalName

     FavoritePerson

     LeastFavoritePerson

I've tried a variety of methods, but seem to be stuck in getting it to work correctly.

My thinking of late has been:

"Pick a random number, use that as a recordID to select a random person - then do it again"

LOAD

Animal2:

     Load

          round(rand()* (1000-1)+1) as PersonID,

          FirstLast as FavoritePerson

     Resident Person;

LOAD

Animal3:

     Load

          round(rand()* (1000-1)+1) as PersonID,

          FirstLast as LeastFavoritePerson

     Resident Person;

NoConcatenate

join(Animal)

     Load *

Resident Animal2;


join(Animal)

     Load *

Resident Animal3;

Drop Tables Animal2, Animal3;

The random number feature works fine - I don't really care if the 2 fields end up as the same person (though with 1M rows, unlikely)

The problem is, I end up with many more records in the Animals table than expected using this method.  And yes, the data needs to be joined/concatenated into the existing Animal table.

Maybe there's a better way to do it / think about it?

Thanks in advance!

1 Solution

Accepted Solutions
marcus_sommer

In this case I would do the following:

MapPerson:

mapping load PersonID, FirstLast from PersonSource;

Animal:

load

     AnimalID, AnimalName,

     applymap('MapPerson', ceil(rand()*1000), '#NV') as FavoritePerson,

     applymap('MapPerson', ceil(rand()*1000), '#NV') as LeastFavoritePerson

from AnimalSource;

You may need to adjust it to the available number of persons and maybe some further logic to avoid that Favorite and LeastFavorite might the same Person (I doubt that it will happens quite often but it could and if it's important and you are sure it couldn't happens in the real scenarios ...).

- Marcus

View solution in original post

4 Replies
marcus_sommer

I think it would be helpful if you describe what the aim of this extending/reducing (?) of the Animal table is? And also a few sample records of each table and how it should be look like after the transforming.

- Marcus

joey_lutes
Partner - Creator
Partner - Creator
Author

The 'aim' - to generate data automatically for various different testing purposes.

Sample records:

Person:

PersonID               FirstLast

1                         JohnSnow

2                         JimSummer

3                         JoshFall

4                         JerryWinter

5                         SallySpring

Intended goal:

Animal:

AnimalID          AnimalName     FavoritePerson              LeastFavoritePerson

1                             Boots               JoshFall                         JerryWinter

2                              Sparky           SallySpring                    JohnSnow

3                              Malto               SallySpring                    JimSummer                        ...                                     ...

With an unlimited supply of PersonIDs, I'm attempting to automatically generate the last 2 columns' data in the Animal table from data from the Person table.

marcus_sommer

In this case I would do the following:

MapPerson:

mapping load PersonID, FirstLast from PersonSource;

Animal:

load

     AnimalID, AnimalName,

     applymap('MapPerson', ceil(rand()*1000), '#NV') as FavoritePerson,

     applymap('MapPerson', ceil(rand()*1000), '#NV') as LeastFavoritePerson

from AnimalSource;

You may need to adjust it to the available number of persons and maybe some further logic to avoid that Favorite and LeastFavorite might the same Person (I doubt that it will happens quite often but it could and if it's important and you are sure it couldn't happens in the real scenarios ...).

- Marcus

joey_lutes
Partner - Creator
Partner - Creator
Author

Thanks Marcus,

That did work.  I ended up going a bit different route to make it less random, but your code was sound and functional.

Thanks for the reply.