
Anonymous

tHDFSInput ArrayIndexOutOfBoundsException in MapReduce job

Hi,
I am trying to run the Talend MapReduce job below, which reads a file from HDFS and, after minor changes, loads it back into HDFS.
tHDFS_Input -> tMap -> tHDFS_Output
The schema of the input component is: field1 (String), field2 (String), field3 (String). I set the row separator to "\n" and the field separator to ",", and all fields are nullable. Everything works as expected if the input file contains rows with a null value in any column except the last one, for example:
text11,text12,text13
,text22,text23
text31,,text33
but it fails if the input file contains a row whose last field is null (e.g. text41,text42,), with the following error:
Task Id : attempt_1429539242538_61552_m_000001_0, Status : FAILED
Error: java.lang.ArrayIndexOutOfBoundsException: 2
    at sp5.testinput_0_1.testInput$row1StructInputFormat$HDFSRecordReader.next(testInput.java:337)
    at sp5.testinput_0_1.testInput$row1StructInputFormat$HDFSRecordReader.next(testInput.java:1)
    at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:198)
    at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:184)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:52)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:430)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:167)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1557)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162)
Is this the expected behavior? I tried the tFileInputDelimited component, and it correctly detects that the last field is null.
I am using Talend Platform for Data Services with Big Data, version 5.6.1.
Thanks,
Anca
4 Replies
Anonymous
Author

Any help would be appreciated!
Anonymous
Author

Can anyone help with the above issue, please?
Anonymous
Author

Is this a bug, or did I miss something?
Any updates?
Anonymous
Author

I saw that there is a ticket on Jira for this, so it seems to be a bug.
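For anyone hitting this later: the symptom (ArrayIndexOutOfBoundsException: 2 on a three-column schema, only when the last field is empty) matches a well-known Java pitfall. Whether Talend's generated HDFSRecordReader actually tokenizes rows with String.split is an assumption on my part, but the sketch below shows how the default split() silently drops trailing empty fields, so values[2] would no longer exist for a row like text41,text42, — and how a negative limit avoids it:

```java
public class TrailingFieldDemo {
    public static void main(String[] args) {
        // A null value in a leading or middle column survives: split()
        // keeps empty strings that are followed by non-empty tokens.
        String middleNull = "text31,,text33";
        System.out.println(middleNull.split(",").length);        // 3 -> index 2 is valid

        // A null LAST field is silently dropped by the default split(),
        // so accessing values[2] throws ArrayIndexOutOfBoundsException: 2.
        String trailingNull = "text41,text42,";
        System.out.println(trailingNull.split(",").length);      // 2 -> values[2] fails

        // Passing a negative limit preserves trailing empty fields.
        System.out.println(trailingNull.split(",", -1).length);  // 3 -> safe
    }
}
```

This would also explain why tFileInputDelimited behaves differently: a parser that preserves trailing empty tokens returns the expected three fields.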