Anonymous
Not applicable

Generate Sample Data from Excel Input

Hello folks,
I'm new to TOS and need some feedback.
I have two Excel sheets, which look as shown in the provided images.
Now I want to join them more or less randomly. Aircraft that have an x in the long_haul_ac column should be joined with 1-2 flights from the flight plan. Aircraft that do not have the x should be joined with 4-6 (short haul) flights from the flight plan.
So in the end there should be several rows containing the same ac_reg but different information from the flight plan.
I started by separating long and short haul aircraft and flights with the filter step. But then I don't know how to iterate over the airplanes. What I want is something like this:
foreach (airplane in airplanes) {
    if (airplane == shorthaul) {
        loop (Numeric.random(4, 6)) {
            join with random shorthaul flight
        }
    } else {
        loop (Numeric.random(1, 2)) {
            join with random longhaul flight
        }
    }
}
But the forEach Component seems not to be sufficient for my problem.
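One way to read the pseudocode above is the plain-Java sketch below. It is only an illustration under assumptions: the class `RandomFlightJoin`, the method `join`, and the `ac_reg;flight` output format are hypothetical and not part of Talend. Only `ac_reg` and the 1-2 / 4-6 ranges come from the description above. Logic like this could presumably live in a custom routine or a tJavaFlex component.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical sketch: class, method, and row format are illustrative, not Talend APIs.
class RandomFlightJoin {

    // Returns "ac_reg;flight" rows: 1-2 flights per longhaul aircraft, 4-6 per shorthaul aircraft.
    static List<String> join(List<String> acRegs, List<Boolean> longHaul,
                             List<String> longFlights, List<String> shortFlights, Random rnd) {
        List<String> rows = new ArrayList<>();
        for (int i = 0; i < acRegs.size(); i++) {
            boolean isLong = longHaul.get(i);
            // Work on a copy so the caller's flight lists are left untouched.
            List<String> pool = new ArrayList<>(isLong ? longFlights : shortFlights);
            // Inclusive ranges: 1..2 for longhaul, 4..6 for shorthaul.
            int wanted = isLong ? 1 + rnd.nextInt(2) : 4 + rnd.nextInt(3);
            // Shuffle and take the first few, so one aircraft never gets the same flight twice.
            Collections.shuffle(pool, rnd);
            int count = Math.min(wanted, pool.size());
            for (String flight : pool.subList(0, count)) {
                rows.add(acRegs.get(i) + ";" + flight);
            }
        }
        return rows;
    }
}
```

Different aircraft may still receive the same flight in this sketch; only repeats within one aircraft are avoided.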
Thanks in advance and kind regards
0683p000009MBN3.png 0683p000009MBN8.png
6 Replies
Anonymous
Not applicable
Author

Hi,

Could you please paste the expected result into the forum so that we can understand your requirement more precisely?
Best regards
Sabrina
Anonymous
Not applicable
Author

Hi Sabrina,
Sorry for my very late response, but other projects forced me to delay my Talend project.
The uploaded image shows what my expected result looks like.
As you can see, it consists of the information from both provided tables, randomly joined. The only constraint is that longhaul flights are joined with longhaul planes.
The landing_date_time will be calculated in a tMap so that we will have some more or less realistic delays...
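A minimal sketch of such a delay calculation in plain Java, for illustration only. The class `LandingDelay`, the method `withDelay`, and the maximum delay parameter are assumptions; only the `landing_date_time` column comes from the job itself. Inside a tMap the same idea would be a single expression.

```java
import java.time.LocalDateTime;
import java.util.Random;

// Hypothetical helper: names and the delay range are illustrative assumptions.
class LandingDelay {

    // Adds a random delay of 0..maxDelayMinutes (inclusive) to the scheduled landing time.
    static LocalDateTime withDelay(LocalDateTime scheduled, int maxDelayMinutes, Random rnd) {
        return scheduled.plusMinutes(rnd.nextInt(maxDelayMinutes + 1));
    }
}
```

For example, `LandingDelay.withDelay(scheduled, 45, new Random())` would give a landing_date_time between the scheduled time and 45 minutes later.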
I also added what I have tried so far. There are two problems with my approach:
1. I can't connect "get_random_longhaul_flight" and the "map_lonhauls" step in order to join my filtered longhaul flights and planes.
2. From the filter step I only get one row. Do I somehow have to iterate over this step several times?
This Talend Job is supposed to be run daily, providing different output. The data should be the base for doing some example analytics.
Thanks & Regards
PS: The restrictions for images are ridiculous, sorry...
0683p000009MBND.jpg 0683p000009MBNI.jpg
Anonymous
Not applicable
Author

Ok, since I didn't manage to get the data the way I wanted from Excel, I chose another way.
I put the possible values as a list into a GenerateRow String field, from which they are picked at random. Now I have one question: is it possible for a value from such a list to be picked only once? Like popping from an array instead of referencing one of its elements? I know I could write some custom code, but I want to stick to standard functions if possible.
Say I have the following values for a String field:
"ED100","ED101","ED110","ED111","ED120","ED121","ED130"
In a GenerateRow step, I want one of the values to be picked from that list at random, but only once! For the next row, it must be guaranteed that the same value cannot be chosen again.
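If no standard function turns out to fit, the "pop" behaviour could be sketched in plain Java roughly as follows. This is custom code under assumptions: the class `PickOnce` and its `next` method are hypothetical names, not Talend functions. The idea is to shuffle the list once and then hand out the values one by one.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.List;
import java.util.Random;

// Hypothetical helper: hands out each value of a list exactly once, in random order.
class PickOnce {
    private final Deque<String> remaining;

    PickOnce(List<String> values, Random rnd) {
        // Shuffle a copy once; afterwards every call just pops the next element.
        List<String> copy = new ArrayList<>(values);
        Collections.shuffle(copy, rnd);
        remaining = new ArrayDeque<>(copy);
    }

    // Removes and returns the next value, or null once every value has been handed out.
    String next() {
        return remaining.pollFirst();
    }
}
```

With the seven values above, seven calls to `next()` would return each value once, and an eighth call would return null, so the number of generated rows must not exceed the list size.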
Thanks & Regards
talendtester
Creator III

Make four inputs:
1. Write the shorthaul flight times to a delimited file and then use tFileInputFullRow to get one at a time.
2. Write the longhaul flight times to a delimited file and then use tFileInputFullRow to get one at a time.
3. Write shorthaul planes to a delimited file and then use tFileInputFullRow to get one at a time.
4. Write longhaul planes to a delimited file and then use tFileInputFullRow to get one at a time.
This ensures nothing is used more than a single time.
Anonymous
Not applicable
Author

Hi, and thanks for your reply. The problem I see is that this approach does not include any randomness. Do you have an idea how to choose a random plane from the delimited file? Otherwise my sample data will look too uniform...
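One possible way to combine the "use each line once" idea with randomness would be to shuffle the lines of the delimited file before consuming them one at a time. The following plain-Java sketch is only an assumption about how that could look: the class `ShuffledLines` and the method `read` are hypothetical, and it loads the whole file into memory, which should be fine for small sample files.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical helper: returns the lines of a delimited file in random order.
class ShuffledLines {

    static List<String> read(Path file, Random rnd) {
        try {
            // Read every line, then shuffle so each run can produce a different order.
            List<String> lines = new ArrayList<>(Files.readAllLines(file));
            Collections.shuffle(lines, rnd);
            return lines;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Writing the shuffled lines back to a file and then reading that file one row at a time would keep the rest of the approach unchanged, with every plane still used only once.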
Kind regards
talendtester
Creator III