Anonymous
Not applicable

Missing winutils.exe (Failed to locate the winutils binary in the hadoop binary path)

Hi,

 

I am using Talend Open Studio for Big Data 6.4.1 on Windows 10 with a Hadoop 2.0 cluster in the cloud (provided by the IBM demo cloud). The output of my job (attached as an image) is:

[WARN ]: org.apache.hadoop.util.NativeCodeLoader - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
[ERROR]: org.apache.hadoop.util.Shell - Failed to locate the winutils binary in the hadoop binary path
java.io.IOException: Could not locate executable null\bin\winutils.exe in the Hadoop binaries.
at org.apache.hadoop.util.Shell.getQualifiedBinPath(Shell.java:378)
at org.apache.hadoop.util.Shell.getWinUtilsPath(Shell.java:393)
at org.apache.hadoop.util.Shell.<clinit>(Shell.java:386)
at org.apache.hadoop.util.StringUtils.<clinit>(StringUtils.java:79)
at org.apache.hadoop.security.Groups.parseStaticMapping(Groups.java:130)
at org.apache.hadoop.security.Groups.<init>(Groups.java:94)
at org.apache.hadoop.security.Groups.<init>(Groups.java:74)
at org.apache.hadoop.security.Groups.getUserToGroupsMappingService(Groups.java:303)
at org.apache.hadoop.security.UserGroupInformation.initialize(UserGroupInformation.java:283)
at org.apache.hadoop.security.UserGroupInformation.ensureInitialized(UserGroupInformation.java:260)
at org.apache.hadoop.security.UserGroupInformation.loginUserFromSubject(UserGroupInformation.java:804)
at org.apache.hadoop.security.UserGroupInformation.getLoginUser(UserGroupInformation.java:774)
at org.apache.hadoop.security.UserGroupInformation.getCurrentUser(UserGroupInformation.java:647)
at org.apache.hadoop.fs.FileSystem$Cache$Key.<init>(FileSystem.java:2806)
at org.apache.hadoop.fs.FileSystem$Cache$Key.<init>(FileSystem.java:2798)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2661)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:379)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:178)
at org.apache.pig.backend.hadoop.datastorage.HDataStorage.init(HDataStorage.java:72)
at org.apache.pig.backend.hadoop.datastorage.HDataStorage.<init>(HDataStorage.java:58)
at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.init(HExecutionEngine.java:215)
at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.init(HExecutionEngine.java:122)
at org.apache.pig.impl.PigContext.connect(PigContext.java:301)
at org.apache.pig.PigServer.<init>(PigServer.java:220)
at org.apache.pig.PigServer.<init>(PigServer.java:205)
at local_project.aggregate_movie_director_0_1.aggregate_movie_director.tPigLoad_1Process(aggregate_movie_director.java:1330)
at local_project.aggregate_movie_director_0_1.aggregate_movie_director.runJobInTOS(aggregate_movie_director.java:2474)
at local_project.aggregate_movie_director_0_1.aggregate_movie_director.main(aggregate_movie_director.java:2323)
[WARN ]: org.apache.pig.PigServer - Empty string specified for jar path

 

I looked at the solutions proposed in:

https://community.talend.com/t5/Sandbox/ERROR-org-apache-hadoop-util-Shell-Failed-to-locate-the-winu...

https://stackoverflow.com/questions/19620642/failed-to-locate-the-winutils-binary-in-the-hadoop-bina...

https://community.talend.com/t5/Sandbox/Missing-winutils-exe-Failed-to-locate-the-winutils-binary-in...

 

But they all seem to require placing the missing winutils.exe in the Hadoop home directory. I didn't install the Hadoop cluster myself, since it is a cloud "as a service" cluster. Do you have any suggestions?

 

Best regards, Sélim

 

 

5 Replies
Anonymous
Not applicable
Author

Hi again,

 

I followed these steps:

"specify the Hadoop home directory that contains the winutils.exe program

  • If you don't have a local Hadoop install on Windows you can download winutils.exe and then:
    • create a Hadoop home directory
    • place winutils.exe in a bin directory under that Hadoop home directory
  • use a system property -Dhadoop.home.dir to point to the Hadoop home directory when you start the Java process. An example:
    java -D"hadoop.home.dir=C:\Users\<username>\Hadoop" -jar my.jar"

I downloaded hadoop-common-2.2.0-bin-master, which contains winutils.exe, and placed it in a new directory, "C:\hadoop_home\hadoop-common-2.2.0-bin-master\bin". I then set the VM argument -D"hadoop.home.dir=C:\hadoop_home\hadoop-common-2.2.0-bin-master" in Advanced parameter -> JVM setting of the job (see attachment talendhadooppig2). It seems to work, since I no longer get the Java exception "java.io.IOException: Could not locate executable null\bin\winutils.exe in the Hadoop binaries." Nevertheless, I still get some warnings:
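For anyone checking the same setup, here is a minimal Java sketch of how the "null\bin\winutils.exe" path in the error message appears to arise. It assumes, based on my reading of the Hadoop 2.x Shell class, that the home directory is taken from the hadoop.home.dir system property first and from the HADOOP_HOME environment variable otherwise. The class name WinutilsCheck and its methods are hypothetical helpers for illustration, not part of Hadoop or Talend.

```java
import java.io.File;

public class WinutilsCheck {

    // Assumption: mirrors the lookup order of Hadoop 2.x
    // (system property first, then environment variable).
    static String resolveHadoopHome() {
        String home = System.getProperty("hadoop.home.dir");
        if (home == null) {
            home = System.getenv("HADOOP_HOME");
        }
        return home; // may still be null if neither is set
    }

    // Builds the Windows path to winutils.exe. Concatenating a null
    // home yields "null\bin\winutils.exe", as seen in the error above.
    static String winutilsPath(String hadoopHome) {
        return hadoopHome + "\\bin\\winutils.exe";
    }

    public static void main(String[] args) {
        String path = winutilsPath(resolveHadoopHome());
        boolean found = new File(path).isFile();
        System.out.println(path + (found ? " found" : " NOT found"));
    }
}
```

Running it with the same -Dhadoop.home.dir argument as the job should show whether the JVM actually sees the directory that contains bin\winutils.exe.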

 

[WARN ]: org.apache.hadoop.util.NativeCodeLoader - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
[WARN ]: org.apache.pig.PigServer - Empty string specified for jar path

 

My job is supposed to use Pig to import 2 tables from HDFS (1 main and 1 reference), do the lookup mapping, and export 2 tables to HDFS (results and rejects) in a new directory. The job runs without ever finishing, while the data flow shown in the design panel stops after only 1 row has been processed, and the output directory is not created (it is even deleted if I create one beforehand).

 

The warnings are quite explicit, but I don't know what to do about them. Are the builtin-java classes not always applicable? How can I set the jar path for the PigServer?

 

Thank you in advance,

Cordially, Sélim


talendhadooppig2.PNG
Anonymous
Not applicable
Author

Hello,

Have you already checked this online document: Talend Help Center: The missing winutils.exe program in the Big Data Jobs?

Best regards

Sabrina

Anonymous
Not applicable
Author

Hi Sabrina,

 

Yes, I checked it and followed the steps. I no longer have the issue (neither in the job output nor in the trace/Java debug screen). But using trace debugging, I see only 1 row with null values flowing through my tPigLoad components (please see the attachment). Besides, I still get 2 warnings:

[WARN ]: org.apache.hadoop.util.NativeCodeLoader - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
[WARN ]: org.apache.pig.PigServer - Empty string specified for jar path

 

Best regards, Sélim


talendhadooppig3.PNG
Anonymous
Not applicable
Author

By the way, to make it work I had to set the following in the JVM setting:

-Dhadoop.home.dir="C:\\hadoop_home\\hadoop-common-2.2.0-bin-master"

with doubled backslashes (\\).

 

Maybe I should mark this post as resolved and create a new one?

Anonymous
Not applicable
Author

Hi,

Has the missing winutils.exe issue been fixed on your end?

For your current issue, could you please show us the full stack trace? Do you want to extract data from HDFS and load it into Pig?

Best regards

Sabrina