Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
I'm attempting to select some generate a large amount of data automatically.
I have a table from which I would like to select some random data, populating into another table.
Example existing table (populated with data - let's say 1000 rows)
Person:
PersonID
FirstLast
I have another table which I would like to populate some data randomly from the Person: table. Ex:
Animal:
AnimalID
AnimalName
FavoritePerson
LeastFavoritePerson
I've tried a variety of methods, but seem to be stuck in getting it to work correctly.
My thinking of late has been:
"Pick a random number, use that as a recordID to select a random person - then do it again"
LOAD
Animal2:
Load
round(rand()* (1000-1)+1) as PersonID,
FirstLast as FavoritePerson
Resident Person;
LOAD
Animal3:
Load
round(rand()* (1000-1)+1) as PersonID,
FirstLast as LeastFavoritePerson
Resident Person;
NoConcatenate
join(Animal)
Load *
Resident Animal2;
join(Animal)
Load *
Resident Animal3;
Drop Tables Animal2, Animal3;
The random number feature works fine - I don't really care if the 2 fields end up as the same person (though with 1M rows, unlikely)
The problem is, I end up with many more records in the Animals table than expected using this method. And yes, the data needs to be joined/concatenated into the existing Animal table.
Maybe there's a better way to do it / think about it?
Thanks in advance!
In this case I would do the following:
MapPerson:
mapping load PersonID, FirstLast from PersonSource;
Animal:
load
AnimalID, AnimalName,
applymap('MapPerson', ceil(rand()*1000), '#NV') as FavoritePerson,
applymap('MapPerson', ceil(rand()*1000), '#NV') as LeastFavoritePerson
from AnimalSource;
You may need to adjust it to the available number of persons and maybe some further logic to avoid that Favorite and LeastFavorite might the same Person (I doubt that it will happens quite often but it could and if it's important and you are sure it couldn't happens in the real scenarios ...).
- Marcus
I think it would be helpful if you describe what the aim of this extending/reducing (?) of the Animal table is? And also a few sample records of each table and how it should be look like after the transforming.
- Marcus
The 'aim' - to generate data automatically for various different testing purposes.
Sample records:
Person:
PersonID FirstLast
1 JohnSnow
2 JimSummer
3 JoshFall
4 JerryWinter
5 SallySpring
Intended goal:
Animal:
AnimalID AnimalName FavoritePerson LeastFavoritePerson
1 Boots JoshFall JerryWinter
2 Sparky SallySpring JohnSnow
3 Malto SallySpring JimSummer ... ...
With an unlimited supply of PersonIDs, I'm attempting to automatically generate the last 2 columns' data in the Animal table from data from the Person table.
In this case I would do the following:
MapPerson:
mapping load PersonID, FirstLast from PersonSource;
Animal:
load
AnimalID, AnimalName,
applymap('MapPerson', ceil(rand()*1000), '#NV') as FavoritePerson,
applymap('MapPerson', ceil(rand()*1000), '#NV') as LeastFavoritePerson
from AnimalSource;
You may need to adjust it to the available number of persons and maybe some further logic to avoid that Favorite and LeastFavorite might the same Person (I doubt that it will happens quite often but it could and if it's important and you are sure it couldn't happens in the real scenarios ...).
- Marcus
Thanks Marcus,
That did work. I ended up going a bit different route to make it less random, but your code was sound and functional.
Thanks for the reply.