Bruce_Perez
Contributor

How to Read Avro/Parquet Files in AWS S3

I am trying to move data from flat files (Avro or Parquet) in AWS S3 using Talend. Which component should I use to read/extract them? I'm using Talend DI v8. I tried tFileInputDelimited, but it only reads CSV or TXT files, and I'm not sure I'm configuring it correctly.

5 Replies
Anonymous
Not applicable

Hello,

You can use the tAvroInput component to read Avro files; see https://help.talend.com/r/en-US/8.0/avro/tavroinput

Use the tFileInputParquet component to read Parquet files; see https://help.talend.com/r/en-US/8.0/parquet/tfileinputparquet

If you can't find the components in Studio, please install them with the Feature Manager; see https://help.talend.com/r/en-US/8.0/studio-user-guide/install-features-to-talend-studio

 

Best regards,
Aiming

Bruce_Perez
Contributor
Author

Thank you very much for your insights. A follow-up question: is this feature available only in the enterprise version, or can I install it in the free version?

Anonymous
Not applicable

Hello @Bruce Perez,

Unfortunately, this feature is only available in the enterprise version. Thanks.

Bruce_Perez
Contributor
Author

Are there any alternative components I can use to read these files? For Avro, I have tried using tFileInputJSON; I'm not sure about Parquet files. Again, I really appreciate your help on this.

rhall1
Contributor III

You *may* be able to extrapolate from a blog post I wrote a long time ago. Unfortunately it is no longer "live", but you can still see it with the Wayback Machine. Here is the link:

http://web.archive.org/web/20200919171507/https://www.talend.com/blog/2019/06/12/talend-pipeline-designer-avro-schemas/

I put some code together to serialise and deserialise JSON data to and from Avro. As I said, it is not exactly what you need, but you may be able to find a solution from this.
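
One note on why tFileInputJSON and tFileInputDelimited are unlikely to work here: Avro container files and Parquet files are binary formats, not text. A quick way to confirm what a file actually is, before choosing a reader, is to check its first four bytes. The sketch below is not taken from the blog post above; it uses only the standard Java library and could be adapted for a tJava component or a custom routine. The class name `FormatSniffer` and its method names are made up for illustration, and it assumes the file has already been copied from S3 to a local path (for example with tS3Get) and that Java 11 or later is in use.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;

// Hypothetical helper: guesses the file format from its leading "magic" bytes.
final class FormatSniffer {
    // Avro object container files begin with the bytes 'O', 'b', 'j', 0x01.
    private static final byte[] AVRO_MAGIC = {'O', 'b', 'j', 1};
    // Parquet files begin (and also end) with the ASCII bytes "PAR1".
    private static final byte[] PARQUET_MAGIC = {'P', 'A', 'R', '1'};

    // Classifies a buffer holding at least the first four bytes of a file.
    static String detect(byte[] head) {
        if (head == null || head.length < 4) {
            return "unknown";
        }
        byte[] first4 = Arrays.copyOf(head, 4);
        if (Arrays.equals(first4, AVRO_MAGIC)) {
            return "avro";
        }
        if (Arrays.equals(first4, PARQUET_MAGIC)) {
            return "parquet";
        }
        return "unknown";
    }

    // Reads only the first four bytes of a local file and classifies them.
    static String detectFile(Path file) throws IOException {
        try (InputStream in = Files.newInputStream(file)) {
            return detect(in.readNBytes(4)); // readNBytes(int) needs Java 11+
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.out.println("usage: java FormatSniffer <local-file>");
            return;
        }
        System.out.println(args[0] + " looks like: " + detectFile(Paths.get(args[0])));
    }
}
```

If a file reports "avro" or "parquet", a text-oriented component will not parse it, and you would need either the dedicated components or custom Java code built on the Apache Avro or Parquet libraries. This check only inspects the header; it does not validate the rest of the file.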