How to read or write Parquet files in ADLS via a Spark Batch Job running in local mode in Talend


Hello Folks,

I have been trying to read Parquet files stored in ADLS via a Spark Batch Job (running in local mode) in Talend Big Data Platform.

Whenever I specify a location in tFileInputParquet (one that definitely exists in ADLS), I get the following error:

org.apache.spark.sql.AnalysisException: Path does not exist: file:/<adls_path>;

If I place the same Parquet files at C/<adls_path> on my RDP machine, the job works fine, which means the job is reading from the C drive rather than from ADLS.

However, if I use tFileInputDelimited and specify the path of a .csv file in ADLS, the job reads the file from ADLS and not from the C drive of the RDP machine.
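One possible explanation, offered as a sketch rather than a confirmed diagnosis: the `file:/` prefix in the error suggests the path reaches Spark without a URI scheme and is then resolved against the default file system, which in local mode is the local disk. The snippet below only imitates that resolution rule in plain Python; it is not Hadoop's or Talend's actual code. The `qualify` helper, the `DEFAULT_FS` value, and the `container`/`account` names are hypothetical, and the `abfss://` form assumes the store is ADLS Gen2.

```python
from urllib.parse import urlsplit

# Assumption: in local mode the default file system is the local disk.
DEFAULT_FS = "file:///"


def qualify(path: str, default_fs: str = DEFAULT_FS) -> str:
    """Imitate how a path without a scheme is resolved against a default FS.

    A path that already carries a scheme (for example abfss://) is returned
    unchanged; a schemeless path is prefixed with the default file system.
    """
    if urlsplit(path).scheme:
        return path
    return default_fs.rstrip("/") + "/" + path.lstrip("/")


# A schemeless path ends up on the local file system.
local = qualify("/data/in.parquet")

# A fully qualified path keeps pointing at the remote store.
# "container" and "account" are placeholder names, not real resources.
remote = qualify("abfss://container@account.dfs.core.windows.net/data/in.parquet")

print(local)
print(remote)
```

If this is what happens here, passing a fully qualified URI to the Parquet component (with the matching storage credentials configured for the job) might be worth trying, though I have not verified this in Talend.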

I'm completely lost right now and would be grateful if someone could help me with this issue.

TIA



Azure

Talend Big Data

v7.x