Hi,
Long shot.
I've got a job which pulls from an API. If I write the JSON response to a file using tFileOutputDelimited and upload it to an S3 bucket via tS3Put, I can read the resulting data in Redshift via Spectrum. It works, but I need to compress these files to save space.
If I add an extra step and use tFileArchive on the JSON file before uploading it to S3, Redshift is unable to read the data and throws the error below.
SQL Error [500310] [XX000]: [Amazon](500310) Invalid operation: Spectrum Scan Error
Details: Spectrum Scan Error code: 15001 context: Error while reading next ION/JSON value: IERR_INVALID_TOKEN. In file 'https://s3.eu-west-1.amazonaw..... [Line 1, Pos 1] query: 2178747 location: dory_util.cpp:1081
process: fetchtask_thread [pid=27919]
As mentioned:
- The raw JSON file can be uploaded and read.
- If I use 7-Zip to create a .gz archive from this file and upload it via tS3Put, the data can be read.
The problem only occurs when I introduce tFileArchive into the mix. I've tried both tar.gz and gzip as the archive format in tFileArchive.
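One thing worth noting about the tar.gz attempt: a tar.gz is a gzip stream wrapping a tar container, so even after decompression Spectrum sees tar header bytes rather than a JSON token, which would produce IERR_INVALID_TOKEN regardless of the S3 key. A minimal stdlib sketch (the payload and file name are made up for illustration):

```python
import gzip
import io
import tarfile

payload = b'{"id": 1, "name": "example"}\n'  # stand-in for the API's JSON response

# Plain gzip: decompressing yields the raw JSON bytes Spectrum expects.
plain_gz = gzip.compress(payload)
assert gzip.decompress(plain_gz) == payload

# tar.gz: the gzip stream contains a tar archive, so the decompressed
# bytes start with a tar header (the member file name), not "{".
buf = io.BytesIO()
with tarfile.open(fileobj=buf, mode="w:gz") as tar:
    info = tarfile.TarInfo(name="data.json")
    info.size = len(payload)
    tar.addfile(info, io.BytesIO(payload))
assert gzip.decompress(buf.getvalue())[:1] != b"{"
```

So for Spectrum, plain gzip is the right tFileArchive format; tar.gz would still fail even with a correct key.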
OK, it looks like the issue was actually in the config for tS3Put. In the Key section I wasn't adding ".gz" onto the end, so the object was treated as plain .json and the gzipped contents couldn't be read. Now fixed.
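For anyone hitting the same thing: Spectrum decides whether to decompress an object from the key's suffix, so the key set in tS3Put must end in .gz. A minimal stdlib sketch of the intended local step (file names are hypothetical; the upload itself is done by tS3Put):

```python
import gzip
import pathlib
import tempfile

# Hypothetical local file standing in for the job's JSON output.
workdir = pathlib.Path(tempfile.mkdtemp())
src = workdir / "response.json"
src.write_bytes(b'{"id": 1}\n')

# Gzip it, keeping the .json.gz suffix. The same suffix must appear in
# the tS3Put Key setting, e.g. "landing/response.json.gz", so that
# Spectrum recognises the object as gzip-compressed JSON.
dst = workdir / "response.json.gz"
dst.write_bytes(gzip.compress(src.read_bytes()))

assert dst.suffixes == [".json", ".gz"]
assert gzip.decompress(dst.read_bytes()) == src.read_bytes()
```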
Hello,
Thanks for letting us know that you have resolved this issue yourself.
Best regards
Sabrina