Bluemoon
Creator

How to load data into HDFS using a Spark Streaming or batch job, with output paths based on the data

Hi All,

 

Usecase:

 

I have data coming in from a file. For example:

RawDataA, A

RawDataB, B

RawDataC, C

.

.

RawDataZ, Z

 

Now I want to store "RawDataX" in the location corresponding to its X value:

/X/RawDataX

 

Note:

I don't want to create 26 tFileOutputDelimited components in the job.

 

Is there a way to use a single tFileOutputDelimited for all records?

 

Heads up:

In DI, we can use tFlowToIterate and a context variable in tFileOutputDelimited to meet this requirement.

 

Can anyone give some ideas on how to implement the same thing in a Spark or MapReduce job?

1 Reply
Anonymous
Not applicable

Hello,

So far, tFlowToIterate is available in Standard ETL jobs only.

Here is a KB article about Spark dynamic context: https://community.talend.com/t5/Architecture-Best-Practices-and/Spark-Dynamic-Context/ta-p/33038.
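Outside of the Talend components, plain Spark's DataFrameWriter.partitionBy can produce a similar one-directory-per-key layout from a single write. Below is a minimal sketch only, assuming the input is a two-column comma-delimited file; the column names "raw" and "key" and the HDFS paths are hypothetical placeholders, and note that the sub-directories come out as key=A, key=B, ... rather than bare /A, /B.

```scala
import org.apache.spark.sql.SparkSession

object PartitionByKey {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("PartitionByKey")
      .getOrCreate()

    // Read the two-column delimited input; column names are made up here.
    val df = spark.read
      .option("header", "false")
      .csv("hdfs:///input/rawdata")
      .toDF("raw", "key")

    // A single write produces one sub-directory per distinct key value,
    // e.g. hdfs:///output/key=A/part-..., key=B/part-..., and so on,
    // so no per-key output component is needed.
    df.write
      .partitionBy("key")
      .mode("overwrite")
      .csv("hdfs:///output")

    spark.stop()
  }
}
```

The same partitionBy option is also available on the Structured Streaming writer (writeStream), so a streaming job could take the same approach in principle.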

Hope it will help.

Best regards

Sabrina