WSyahirah21
Creator

tHDFSPut throws error message "java.lang.NullPointerException: null"

I am trying to upload my JSON data to HDFS before storing it in Hive, using Talend Data Fabric. My workflow is:

 

**** > tMap / tLogRow (structuring the JSON data) > tHDFSPut > tHiveCreateTable > tHiveLoad.

 

Workflow:

0695b00000SqArGAAV.png 

Result from tLogRow:

0695b00000SqArkAAF.png 

tHDFSPut configuration:

0695b00000SqAtqAAF.png 

I already managed to structure the data into a proper table (with schema). However, once I tried to upload the data into HDFS using the tHDFSPut component, it gives the error below (see the detailed error in the attachment). It seems like I did not fetch any data from tLogRow, e.g. ((String)globalMap.get("tLogRow_1_CURRENT_FILEPATH") - I have no idea how to use this in this case):

 

"java.lang.NullPointerException: null" 

 

My questions are:

  1. Am I using the correct approach/component to load data into HDFS in Talend?
  2. Is there any other recommended workflow, as long as I am able to load the data into HDFS before saving it into Hive?

 

Note:

  1. The reason I upload the data to HDFS first, instead of loading it straight into Hive, is that inserting the data into Hive without saving it into temp storage is slow (for streaming data)
  2. Reference: https://www.youtube.com/watch?v=W4xQGnC8sY4&t=55s

 

1 Solution

Accepted Solutions
Anonymous
Not applicable

Check the 'Use Perl5 Regex...' box if you set a regular expression in FileMask, or write the full file name if there is only one file in the folder.

 

 


8 Replies
WSyahirah21
Creator
Author

Another trial was using tHDFSOutput, where I synced the schema from tMap_1, but the same error persists.

0695b00000SqBlsAAF.png
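As I understand it, tHDFSOutput streams the incoming rows directly into an HDFS file rather than copying a local one; a rough sketch of that behaviour with the plain Hadoop client API (the URI and path are assumptions, not from my job):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsDirectWriteSketch {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.set("fs.defaultFS", "hdfs://namenode:8020"); // assumed namenode URI
        try (FileSystem fs = FileSystem.get(conf);
             FSDataOutputStream out = fs.create(new Path("/user/talend/rows.csv"))) {
            out.writeBytes("id;name\n1;sample\n"); // rows written straight into HDFS
        }
    }
}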

Anonymous
Not applicable

Hi

You are using an existing connection on the HDFS component; make sure the connection is created before it is used. We usually create the connection at the beginning of the job:

PreJob--oncomponentok--tHDFSConnection.

 

You are not using tHDFSPut correctly. This component loads a local file into the HDFS system, but I see the Local folder field is empty. Change your job design as below:

PreJob--oncomponentok--tHDFSConnection

 ....tMap / tLogRow (structuring the JSON data)

|onsubjobok

tHDFSPut

|onsubjobok

tHiveCreateTable --oncomponentok--> tHiveLoad
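Under the hood, tHDFSPut essentially copies a local file into HDFS; a minimal sketch of that operation with the plain Hadoop client API (paths and URI are placeholders, not the ones from your job):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsPutSketch {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.set("fs.defaultFS", "hdfs://namenode:8020"); // assumed namenode URI
        try (FileSystem fs = FileSystem.get(conf)) {
            // Copy a local file into HDFS -- the operation tHDFSPut performs
            // for every file matched in its Files table.
            fs.copyFromLocalFile(new Path("/tmp/staging/data.csv"),
                                 new Path("/user/talend/data.csv"));
        }
    }
}

This is why the onsubjobok links matter: the subjob producing the data must have finished writing its local file before tHDFSPut runs.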

 

Regards

Shong

 

WSyahirah21
Creator
Author

Hi Shong,

 

Yes, I actually already made the HDFS connection in the workflow.

And I tried the suggested workflow you mentioned above.

 

0695b00000SqCaMAAV.png 

Setting in tHDFSPut:

0695b00000SqCZtAAN.png

However, I am quite confused: how do I pass the result from tLogRow to tHDFSPut? Here I see only the local directory (where the source file is stored) and the HDFS directory (the destination directory to store the file).

 

 

Anonymous
Not applicable

Store the result to a local file using tFileOutputDelimited first, then select that local file on tHDFSPut.
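To keep both components pointed at the same file, one option is a shared context variable; a sketch of the field values, assuming a variable named context.stagingDir (not part of your current job):

// tFileOutputDelimited -> File Name:        context.stagingDir + "/out.csv"
// tHDFSPut             -> Local directory:  context.stagingDir
//                         Files / Filemask: "out.csv"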

 

WSyahirah21
Creator
Author

Hi Shong,

 

I already put the data into a local CSV file. But when connecting it to tHDFSPut, I still receive the following error:

 

java.lang.NullPointerException: null

0695b00000SqCkgAAF.png

Anonymous
Not applicable

Why do I see that the 'Use an existing connection' box is unchecked? In the Local directory field, set the path to the directory path, not the file path. In the Files table, set the New name.
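For example (illustrative values, assuming the staging file is out.csv):

// Local directory: "C:/talend/staging"   <- a directory, not ".../out.csv"
// Files table:     Filemask = "out.csv", New name = "out.csv"
// HDFS directory:  "/user/talend/"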

WSyahirah21
Creator
Author

0695b00000SqCpRAAV.png 

The existing connection is checked (connected to tHDFSConnection - a feature of my studio just highlights the layout of the box in the screenshot).

 

I already changed the local dir, and I will leave the new name blank, as I don't want to use a new name and will just retain the existing one. However, it still gives the null value error. Did I miss any config in that component?

Anonymous
Not applicable

Check the 'Use Perl5 Regex...' box if you set a regular expression in FileMask, or write the full file name if there is only one file in the folder.
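To illustrate the difference (filenames are assumptions): with the box unchecked, the Filemask is treated as a glob such as *.csv, while with it checked it must be a regular expression such as .*\.csv. A quick check with java.util.regex (Talend's Perl5 engine behaves the same way for masks this simple):

import java.util.regex.Pattern;

public class FilemaskSketch {
    public static void main(String[] args) {
        String regexMask = ".*\\.csv"; // regex equivalent of the glob "*.csv"
        System.out.println(Pattern.matches(regexMask, "out.csv"));  // true
        System.out.println(Pattern.matches(regexMask, "out.json")); // false
        // Passing the glob "*.csv" here would throw a PatternSyntaxException,
        // since a glob is not a valid regular expression.
    }
}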