I have been writing to S3 starting from a tRest component, and it has been awkward but effective: I call tRest in a Standard job, produce a Parquet file, and then call a Big Data Batch job to move that file to S3.
Now I'd like to pull from S3 in a Big Data Batch job (tDeltaLakeInput) and write the data to a JSON file.
I can pull the data and write it to a tLogRow, and that works.
But the tFileOutputJSON in a Big Data Batch job behaves differently from the component of the same name in a Standard job. The job creates a folder containing several files ("part-00000", "_SUCCESS", ".part-00000.crc", and "._SUCCESS.crc"), only one of which appears to be JSON.
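That folder layout is normal Spark output rather than a Talend quirk: each partition is written as its own "part-*" file, and "_SUCCESS" plus the ".crc" files are job markers and checksums. If the output directory is reachable from a Standard job, a small post-step can promote the single part file to an ordinary .json file. Here is a minimal sketch in plain Java (for instance inside a tJava component); the paths are placeholders, and it assumes the job produced a single partition:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;
import java.util.stream.Stream;

public class PromotePartFile {
    public static void main(String[] args) throws IOException {
        // The directory the Big Data Batch job created (placeholder path).
        Path sparkOutputDir = Paths.get("/data/out/result.json");
        // Where the single, ordinary JSON file should end up (placeholder path).
        Path target = Paths.get("/data/out/result.final.json");

        try (Stream<Path> entries = Files.list(sparkOutputDir)) {
            // The ".crc" files are dot-prefixed, so matching on "part-" skips them.
            Path partFile = entries
                    .filter(p -> p.getFileName().toString().startsWith("part-"))
                    .findFirst()
                    .orElseThrow(() -> new IOException("no part-* file in " + sparkOutputDir));
            // Rename the part file; "_SUCCESS" and the ".crc" files can be deleted afterwards.
            Files.move(partFile, target, StandardCopyOption.REPLACE_EXISTING);
        }
    }
}
```

If the job writes straight to HDFS or S3 rather than a mounted path, the same idea applies through the Hadoop FileSystem API instead of java.nio.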
How can I get data from S3 into Talend in a more usable form, so other jobs can use it?
Hello @pthomas
I don't use the Big Data Batch version, but the Open Studio version has native S3 components, so I don't really see the benefit of using tRest to integrate with S3.
In any case, with a design like the one below, I think you could achieve the desired result.
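If it helps, the native components in a Standard job would look like tS3Connection, then tS3Get to download the object to a local file, then tFileInputJSON to read it like any other file. As a rough sketch of what that download amounts to in plain Java with the AWS SDK for Java v2 (the bucket, key, and local path are placeholder values, and credentials are assumed to come from the default provider chain):

```java
import java.nio.file.Paths;

import software.amazon.awssdk.regions.Region;
import software.amazon.awssdk.services.s3.S3Client;
import software.amazon.awssdk.services.s3.model.GetObjectRequest;

public class S3DownloadSketch {
    public static void main(String[] args) {
        // Credentials are resolved through the default provider chain here.
        try (S3Client s3 = S3Client.builder().region(Region.US_EAST_1).build()) {
            // Download one object to a local file, much as tS3Get would (placeholder names).
            s3.getObject(
                    GetObjectRequest.builder()
                            .bucket("my-bucket")
                            .key("exports/data.json")
                            .build(),
                    Paths.get("/tmp/data.json"));
        }
    }
}
```

Pulling the object down as a single local file sidesteps the part-file layout entirely, so downstream Standard jobs can consume the JSON directly.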
Best Regards
I asked why we weren't using the native S3 components in the Standard jobs, but got a generic answer. I'll dig deeper with them.
Thanks.