Hi,
I am a DE at PT company. We are currently running a POC with your product.
While building a data flow, I am struggling to load multiple files from an S3 folder into a local folder.
Please see the screenshot below for more information.
Everything up to the last step (tS3Get_1) works fine, but I don't know how to get the S3 path to each file dynamically.
I am attaching WORKSHOP.zip, an archive that contains the exported job.
OK, first of all, I have removed your attachment from the post. You had your keys in that job, and that is not safe. Be careful about sharing data like that.
I have looked at the job (but don't want to run it with your data) and can see what you are trying to achieve I think. But I am curious as to what happens when you run it? Do you have screenshots, errors, etc, that you could share? If you know what paths you are expecting to dynamically generate, can you add a tLogRow to output the paths you are getting and compare those against what you are expecting?
Here is the screenshot with the output
1 - with the output from the first screenshot
2 - I need to get the second one working: convert back with tFlowToIterate and get those two files to the local folder
Your tS3Get_1 looks OK to me, but without running it, it is quite difficult to debug. What is not working when you run it to completion? Is there an error? If you limit the file retrieval to a single file and hardcode the filename in the tS3Get_1, do you get a file? This would test your bucket and key. If yes, are you able to see the files you wish to retrieve in your workspace folder?
If you add your tFlowToIterate back and add a tFixedFlowInput to that, add the calculated values you have in tS3Get_1, then output that to a tLogRow, do the values look OK?
It's downloading only one file. It works when I connect S3List directly to S3Get, but the issue comes when I add a Filter component: I don't know how to use a variable from FlowToIterate inside the Key attribute of S3Get.
Oh I see. First of all, it is probably easier for you to tick the "Use the default (key, value) in global variables" box on the tFlowToIterate. That will give you a globalMap key of the form {row name}.{column name}. Your row name will be "row2" if it is coming from the tFilterRow_2 component. The column name will be whichever column you wish to use in the subsequent components. So in your job you have "bucket_file_name".
It should be pointed out that you are not passing the key name via the tFilterRow_2. You may want to do this and follow the same rule as above.
I am not sure I understood you
Should I use bucket_file_name in S3Get Component?
I am actually passing it, it's just called bucket_file_name
It should be pointed out that you are not passing the key name via the tFilterRow_2.
If you take a look at the parameters that are required by the tS3Get_1 component, you need to supply all of them. Sorry, I saw the name of your column and didn't check the globalMap. You are not passing the bucket.
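To illustrate the point about supplying every parameter, here is a minimal plain-Java sketch, not Talend-generated code. It simulates what happens on each iteration: both the bucket and the key have to be present in the globalMap before the download step reads them. The column name "bucket_name", the bucket "my-bucket" and the file names are assumptions made up for the example; only "row2" and "bucket_file_name" come from this thread.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class IterateParamsSketch {
    // Simulates iterating over rows: each row overwrites the same globalMap
    // keys, and the downstream step reads bucket and key for that iteration.
    static List<String> resolve(List<String[]> rows) {
        Map<String, Object> globalMap = new HashMap<>(); // stand-in for Talend's globalMap
        List<String> targets = new ArrayList<>();
        for (String[] row : rows) {
            globalMap.put("row2.bucket_name", row[0]);      // hypothetical column
            globalMap.put("row2.bucket_file_name", row[1]); // column named in the thread
            // Expressions of this shape would go in the Bucket and Key fields
            String bucket = (String) globalMap.get("row2.bucket_name");
            String key = (String) globalMap.get("row2.bucket_file_name");
            targets.add(bucket + "/" + key);
        }
        return targets;
    }

    public static void main(String[] args) {
        List<String[]> rows = new ArrayList<>();
        rows.add(new String[] {"my-bucket", "incoming/file_a.csv"}); // made-up values
        rows.add(new String[] {"my-bucket", "incoming/file_b.csv"});
        for (String target : resolve(rows)) {
            System.out.println(target);
        }
    }
}
```

If the bucket column is missing from the row, the first lookup returns null, which is the situation described above.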
So, what should I do with the params for S3Get?
Do you have an existing example of a job with a tFlowToIterate followed by a working tS3Get component?
A tFlowToIterate simply adds each column in the row being iterated to the globalMap. So, for your column "bucket_file_name", you would retrieve the value with code like this:
((String)globalMap.get("row2.bucket_file_name"))
The first part, "(String)", casts the value to a String; this is needed because the globalMap stores all values as Objects. The key that is used automatically is the one I described earlier:
{row name}.{column name}
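As a rough illustration of the cast and the key format, here is a small standalone Java sketch. It uses an ordinary HashMap as a stand-in for the globalMap; the helper defaultKey and the value "incoming/file_a.csv" are hypothetical and exist only for this example.

```java
import java.util.HashMap;
import java.util.Map;

public class GlobalMapSketch {
    // Builds a key in the default format: {row name}.{column name}
    static String defaultKey(String rowName, String columnName) {
        return rowName + "." + columnName;
    }

    public static void main(String[] args) {
        // Stand-in for the globalMap, which holds every value as an Object
        Map<String, Object> globalMap = new HashMap<>();

        // Roughly what the tFlowToIterate does for the current row (made-up value)
        globalMap.put(defaultKey("row2", "bucket_file_name"), "incoming/file_a.csv");

        // The expression from the post: look up the key, then cast Object to String
        String key = ((String) globalMap.get("row2.bucket_file_name"));
        System.out.println(key);

        // A key that was never put in the map comes back as null
        System.out.println(globalMap.get("row2.missing_column"));
    }
}
```

In the job itself, only the lookup expression would be pasted into the Key field of the tS3Get component; the rest is scaffolding so that the snippet runs on its own.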