Hi,
I have created a simple Big Data Spark job that uses a tMap to look up a few files.
The job compiles fine, and the lookup HDFS files have content in them.
But the code fails during execution; the error message is below.
If anyone has faced a similar issue, please let me know how it was resolved.
org.talend.bigdata.dataflow.SpecException: Invalid input accessor: clsn_62.null
at org.talend.bigdata.dataflow.hmap.HMapSpec$JoinDef.deserialize(HMapSpec.java:1206)
at org.talend.bigdata.dataflow.hmap.HMapSpec$JoinDef.access$1800(HMapSpec.java:1122)
at org.talend.bigdata.dataflow.hmap.HMapSpec.joinKey(HMapSpec.java:563)
at org.talend.bigdata.dataflow.hmap.HMapSpecBuilder.joinKey(HMapSpecBuilder.java:171)
at t_data_wh.tdata_wh_bd_spec_0_1.tdata_wh_BD_spec.tHiveInput_2Process(tdata_wh_BD_spec.java:2557)
at t_data_wh.tdata_wh_bd_spec_0_1.tdata_wh_BD_spec.tFileInputDelimited_6_HDFSInputFormatProcess(tdata_wh_BD_spec.java:4458)
at t_data_wh.tdata_wh_bd_spec_0_1.tdata_wh_BD_spec.run(tdata_wh_BD_spec.java:4856)
at t_data_wh.tdata_wh_bd_spec_0_1.tdata_wh_BD_spec.runJobInTOS(tdata_wh_BD_spec.java:4672)
at t_data_wh.tdata_wh_bd_spec_0_1.tdata_wh_BD_spec.main(tdata_wh_BD_spec.java:4554)
[ERROR]: t_data_wh.tdata_wh_bd_spec_0_1.tdata_wh_BD_spec - TalendJob: 'tdata_wh_BD_spec' - Failed with exit code: 1.
@badri-nair, check the points below:
1- Make sure the input for this flow (row1) is not empty; this input can be a source file or a DB query.
2- Make sure you are not joining on multiple keys; if you need to do so, please use one tMap for each key join.
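The "one tMap per key join" advice can be pictured as chaining single-key lookups instead of resolving every key in one mapping step. Below is a plain-Java sketch of that idea; the class, field names, and lookup tables are invented for illustration and are not Talend's generated code:

```java
import java.util.Map;

// Conceptual sketch (not Talend-generated code): instead of resolving
// several lookup keys inside a single mapping step, apply one lookup
// per step, as one tMap per key join would in the Spark job.
public class ChainedLookups {
    // Hypothetical lookup tables, standing in for the HDFS lookup files.
    static final Map<String, String> CUSTOMER_LOOKUP = Map.of("C1", "Acme");
    static final Map<String, String> REGION_LOOKUP = Map.of("R1", "EMEA");

    // Step 1: enrich the row with the customer name only (first "tMap").
    static String[] joinCustomer(String[] row) {
        return new String[] { row[0], row[1],
                CUSTOMER_LOOKUP.getOrDefault(row[0], "") };
    }

    // Step 2: a separate step enriches with the region (second "tMap").
    static String[] joinRegion(String[] row) {
        return new String[] { row[0], row[1], row[2],
                REGION_LOOKUP.getOrDefault(row[1], "") };
    }

    public static void main(String[] args) {
        String[] row = { "C1", "R1" };                      // main flow record
        String[] enriched = joinRegion(joinCustomer(row));  // one join per step
        System.out.println(String.join(",", enriched));     // C1,R1,Acme,EMEA
    }
}
```

Each step joins on exactly one key, which is what splitting the job into one tMap per join achieves.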
Thank you very much, Manohar.
I had to use 4 tMaps for the 4 joins; in the previous Standard job I had done them all in just one tMap. But it does work.
Thanks
Badri Nair
@badri-nair, can you show your job design? You can first load the data to a tHashOutput and read it back with a tHashInput; if you want to take multiple lookups, you can use multiple tHashInputs.
Hi Manohar,
There are no tHash components for a Spark job, only tCache.
In the Standard job I had 4 tHashOutputs, with a tHashInput linked to each one, and the 4 inputs were used as lookups in a single tMap.
It looks like that can't be done in a Spark job.
Thanks
Badri Nair