We are trying to use Talend Batch (Spark) jobs to access Hive in a Kerberos cluster, but we get the error "Can't get Master Kerberos principal for use as renewer".
With standard (non-Spark) jobs in Talend we can access Hive without any issue.
Sample Batch Job:
Below are the observations:
I am not sure what exactly is causing the token problem. Could someone help us find the root cause?
One more thing to add: if I read from or write to HDFS instead of Hive using Spark Batch jobs, it works. So the problem only occurs with Hive and Kerberos.
It contains all the site XML files required to connect to the cluster. It is not a key-value property. You need to build a proper Maven JAR that includes all the Hadoop configuration files.
In your Talend job, add the JAR using tLibrary, which adds all the config files when the job is built and deployed. We haven't passed any extraJavaOptions parameter.
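Before bundling the files, it may be worth checking that they actually carry the Kerberos settings. As far as I know, Hadoop raises this renewer error when it cannot resolve a principal for the delegation token renewer, which on YARN is read from `yarn.resourcemanager.principal`. Treat that as an assumption to verify against your Hadoop version. The sketch below is only a diagnostic helper, not part of Talend: the function names and the sample XML are made up for illustration, and it just reports which required properties are absent or empty in a site file.

```python
import xml.etree.ElementTree as ET


def read_site_properties(xml_text):
    """Parse the content of a Hadoop *-site.xml file into a {name: value} dict."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        if name:
            props[name.strip()] = (prop.findtext("value") or "").strip()
    return props


def missing_properties(xml_text, required):
    """Return the required property names that are absent or have an empty value."""
    props = read_site_properties(xml_text)
    return [name for name in required if not props.get(name)]


# Hypothetical yarn-site.xml content, used only to demonstrate the check.
SAMPLE_YARN_SITE = """<configuration>
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>rm.example.com</value>
  </property>
</configuration>"""

# Assumption: this is the property the renewer lookup depends on.
print(missing_properties(SAMPLE_YARN_SITE, ["yarn.resourcemanager.principal"]))
```

To run it against a real file, pass the file's text instead of the sample, for example `missing_properties(open("yarn-site.xml").read(), [...])`. The same helper can be pointed at hive-site.xml with `hive.metastore.kerberos.principal` in the list.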
Unfortunately, this didn't solve our case. I generated a simple JAR file with all the configuration files:
- core-site.xml
- hdfs-site.xml
- hive-site.xml
- krb5.conf
- mapred-site.xml
- tez-site.xml
- yarn-site.xml
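A quick way to double-check a JAR like this is to list its entries and see where each file ended up. Hadoop loads files such as core-site.xml by name from the classpath, so they normally need to sit at the root of the JAR and not inside a subfolder. The sketch below is a hedged helper written for this thread (the function name `check_jar` and the JAR file name in the usage note are hypothetical); it only inspects the archive and does not touch the cluster.

```python
import zipfile

# File names taken from the list above.
EXPECTED = [
    "core-site.xml", "hdfs-site.xml", "hive-site.xml", "krb5.conf",
    "mapred-site.xml", "tez-site.xml", "yarn-site.xml",
]


def check_jar(jar_path, expected=EXPECTED):
    """Return {file name: status}, where status is 'root', 'nested at <path>' or 'missing'."""
    # A JAR is a ZIP archive, so the standard zipfile module can read it.
    with zipfile.ZipFile(jar_path) as jar:
        entries = jar.namelist()
    report = {}
    for name in expected:
        if name in entries:
            report[name] = "root"
            continue
        # Look for the same file name inside any subfolder of the archive.
        nested = [entry for entry in entries if entry.endswith("/" + name)]
        report[name] = "nested at " + nested[0] if nested else "missing"
    return report
```

For example, `check_jar("hadoop-conf.jar")` returns a dict you can print line by line. Any entry reported as nested or missing is a candidate explanation for a setting that the Spark job does not see.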
It seems that the files were taken into account, because I had to delete the python property in core-site.xml that generated an error.
Thank you anyway.