KB-2588: How to integrate Hadoop with Centrify?

Centrify DirectControl

12 April,16 at 11:07 AM

Applies to: All versions of Centrify DirectControl
 
Question:
Hadoop has native support for Kerberos. In this environment it is distributed by Cloudera, whose security guide covers setting up Hadoop with Kerberos.
 
The link below is provided as a courtesy. Centrify takes no responsibility if it becomes unavailable over time, as it is maintained by the vendor.

http://www.cloudera.com/content/cloudera-content/cloudera-docs/CDH4/4.2.0/CDH4-Security-Guide/cdh4sg_topic_3.html
 
Step 4 of that guide, "Create and Deploy the Kerberos Principals and Keytab Files", looks achievable with Centrify's adkeytab command, but there appear to be issues.
 
The Hadoop cluster consists of the following hosts: 
 
opbhdname001.lab.yourcompany.com 
opbhddata004.lab.yourcompany.com 
opbhddata005.lab.yourcompany.com 
opbhddata006.lab.yourcompany.com 
opbhddata007.lab.yourcompany.com 
opbhddata008.lab.yourcompany.com 
opbhddata009.lab.yourcompany.com 
opbhddata010.lab.yourcompany.com 
opbhddata011.lab.yourcompany.com 
 
Each host is already 'Centrified' for authentication against the realm, and users are logging in successfully. 
 
According to the Hadoop security documentation, each host needs three service accounts: 
- hdfs
- mapred
- yarn
 
Each of these service accounts should have a separate keytab file containing the service account principal and the 'host' principal.
 
For example, on opbhdname001.lab.yourcompany.com there should be a keytab file at /etc/hadoop/conf/hdfs.keytab containing keys for:
 
  hdfs/opbhdname001.lab.yourcompany.com@WIN.yourcompany.com
  host/opbhdname001.lab.yourcompany.com@WIN.yourcompany.com
  
The /etc/hadoop/conf/yarn.keytab file should contain:
 
  yarn/opbhdname001.lab.yourcompany.com@WIN.yourcompany.com 
  host/opbhdname001.lab.yourcompany.com@WIN.yourcompany.com
  
The same applies to the mapred user and to every other host in the cluster. 
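
To illustrate the naming pattern described above, the following sketch prints the principals each keytab is expected to hold. The function name expected_principals is hypothetical; the host and realm are the example values from this question.

```shell
#!/bin/sh
# Hypothetical helper: print the two principals expected in <service>.keytab
# on one host (the service principal and the 'host' principal).
expected_principals() {
    service=$1
    host=$2
    realm=$3
    printf '%s/%s@%s\n' "$service" "$host" "$realm"
    printf 'host/%s@%s\n' "$host" "$realm"
}

# List the expected contents of all three keytabs on the name node.
for service in hdfs mapred yarn; do
    echo "/etc/hadoop/conf/$service.keytab:"
    expected_principals "$service" opbhdname001.lab.yourcompany.com WIN.yourcompany.com |
        sed 's/^/  /'
done
```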
 
Which Centrify commands can be used to achieve what Hadoop requires?
 
Answer:
Please follow the steps below:
 
1. Set the following parameters in /etc/centrifydc/centrifydc.conf
adclient.krb5.service.principals: http ftp cifs nfs mapred yarn hdfs 
adclient.krb5.password.change.interval: 0 
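
If the same two settings have to be applied on many hosts, a small script may be less error-prone than editing by hand. The following is only a sketch: the helper name set_param and the CONF variable are hypothetical, and by default it works on a scratch file rather than the real configuration.

```shell
#!/bin/sh
# Hypothetical helper: set "key: value" in a centrifydc.conf-style file,
# replacing any existing uncommented line for the same key.
set_param() {
    file=$1
    key=$2
    value=$3
    tmp=$(mktemp) || return 1
    # Copy every line except an existing setting for this exact key.
    awk -v k="$key" '{
        line = $0
        sub(/^[ \t]+/, "", line)
        if (index(line, k) == 1 && substr(line, length(k) + 1) ~ /^[ \t]*:/) next
        print
    }' "$file" > "$tmp" || { rm -f "$tmp"; return 1; }
    printf '%s: %s\n' "$key" "$value" >> "$tmp"
    # Overwrite in place so the original file keeps its owner and mode.
    cat "$tmp" > "$file"
    rm -f "$tmp"
}

# Demonstrate on a scratch file. To apply for real, run as root with
# CONF=/etc/centrifydc/centrifydc.conf after taking a backup.
CONF=${CONF:-$(mktemp)}
set_param "$CONF" adclient.krb5.service.principals "http ftp cifs nfs mapred yarn hdfs"
set_param "$CONF" adclient.krb5.password.change.interval 0
cat "$CONF"
```

Running the helper twice leaves a single line per key, so it is safe to re-run.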
 
2. Leave and rejoin the domain so that the mapred, yarn and hdfs services are added to the host's SPNs: 
adleave -r 
adjoin <your adjoin options> 
E.g. 
  adjoin mba.local -z test -c "mba.local/Unix/servers" 
 
3. Create the service accounts for mapred, yarn and hdfs on one host (the resulting keytabs can then be copied to the other hosts and merged there): 
 
adkeytab --new -c <container> -K <keytab file location> -u <AD account that can create new account> <account> 
 
E.g. 
  adkeytab --new -c "mba.local/Users" -K /tmp/mapred.keytab -u administrator mapred 
  adkeytab --new -c "mba.local/Users" -K /tmp/yarn.keytab -u administrator yarn 
  adkeytab --new -c "mba.local/Users" -K /tmp/hdfs.keytab -u administrator hdfs 
 
4. Pack the service account keytabs into a tar file. Use -C so the archive stores bare file names and extracts directly into /tmp:
  tar -cf /tmp/service_keytab.tar -C /tmp mapred.keytab hdfs.keytab yarn.keytab 
 
5. Deploy the tar file to each host and untar the file in /tmp 
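
The deployment step can be scripted roughly as follows. This is a sketch under several assumptions that are not part of the original answer: root SSH access to every data host, scp as the copy mechanism, and an archive whose members are stored as bare file names. The run and deploy_all helpers and the DRY_RUN variable are hypothetical. By default the script only prints the commands.

```shell
#!/bin/sh
# Data hosts from the example cluster; the keytabs are assumed to have been
# created on opbhdname001.
HOSTS="opbhddata004 opbhddata005 opbhddata006 opbhddata007 opbhddata008 opbhddata009 opbhddata010 opbhddata011"
DOMAIN=lab.yourcompany.com

# Print each command; execute it only when DRY_RUN=0.
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = 0 ]; then "$@"; fi
}

# Copy the archive to every host and unpack it in /tmp there.
deploy_all() {
    for h in $HOSTS; do
        run scp /tmp/service_keytab.tar "root@$h.$DOMAIN:/tmp/" || return 1
        run ssh "root@$h.$DOMAIN" "cd /tmp && tar -xf service_keytab.tar" || return 1
    done
}

deploy_all
```

Set DRY_RUN=0 once the printed commands look right for your environment.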
 
* Repeat steps 6-10 on each host. 
 
6. Copy the system default keytab into /tmp to perform the keytab merge:
cp /etc/krb5.keytab /tmp/ 
 
7. Use ktutil to perform the merge. Start it from /tmp, because the rkt and wkt commands below use relative file names:
  cd /tmp 
  /usr/share/centrifydc/kerberos/sbin/ktutil 
 
  rkt krb5.keytab 
  rkt mapred.keytab 
  wkt mapred.merged.keytab 
  clear 
  rkt krb5.keytab 
  rkt yarn.keytab 
  wkt yarn.merged.keytab 
  clear 
  rkt krb5.keytab 
  rkt hdfs.keytab 
  wkt hdfs.merged.keytab 
  clear 
  exit 
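
When repeating the merge on many hosts, the same ktutil session can be fed from a script instead of typed by hand. The sketch below assumes the Centrify-bundled ktutil reads commands from standard input the way MIT ktutil does. The merge_script function and the KTUTIL variable are hypothetical names.

```shell
#!/bin/sh
KTUTIL=${KTUTIL:-/usr/share/centrifydc/kerberos/sbin/ktutil}

# Emit the ktutil commands that merge krb5.keytab with each service keytab.
merge_script() {
    for service in mapred yarn hdfs; do
        printf 'rkt krb5.keytab\nrkt %s.keytab\nwkt %s.merged.keytab\nclear\n' \
            "$service" "$service"
    done
    echo exit
}

if [ -x "$KTUTIL" ]; then
    # Run from /tmp because the file names above are relative.
    (cd /tmp && merge_script | "$KTUTIL")
else
    # ktutil is not installed here: just show what would be sent to it.
    merge_script
fi
```

Note that wkt appends to an existing file, so remove any leftover *.merged.keytab files before re-running the merge.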
 
8. Copy the merged keytabs to the destination folder under their final names:
  cp /tmp/mapred.merged.keytab /etc/hadoop/conf/mapred.keytab 
  cp /tmp/hdfs.merged.keytab /etc/hadoop/conf/hdfs.keytab 
  cp /tmp/yarn.merged.keytab /etc/hadoop/conf/yarn.keytab 
 
9. Clean up the keytabs in /tmp. Use the full path so nothing is removed from the current directory by mistake:
  rm /tmp/*.keytab 
 
10. Verify that the keytab works by obtaining a ticket with kinit -kt and listing it with klist: 
  /usr/share/centrifydc/kerberos/bin/kinit -kt /etc/hadoop/conf/hdfs.keytab hdfs 
  /usr/share/centrifydc/kerberos/bin/klist 
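
Beyond obtaining a ticket, it can be useful to confirm that a merged keytab actually lists the service and host principals the question asks for. The sketch below assumes that klist -kt prints one entry per line with the principal as the last field, as MIT klist does. The has_principal and check_keytab helpers are hypothetical, and the realm is the example value from the question.

```shell
#!/bin/sh
KLIST=${KLIST:-/usr/share/centrifydc/kerberos/bin/klist}

# Read a "klist -kt" listing on stdin; succeed if the principal appears in it.
has_principal() {
    awk -v p="$1" '$NF == p { found = 1 } END { exit !found }'
}

# Check that a keytab holds both <service>/<fqdn>@<realm> and host/<fqdn>@<realm>.
check_keytab() {
    keytab=$1
    service=$2
    fqdn=$3
    realm=$4
    listing=$("$KLIST" -kt "$keytab") || return 1
    for p in "$service/$fqdn@$realm" "host/$fqdn@$realm"; do
        printf '%s\n' "$listing" | has_principal "$p" ||
            { echo "missing from $keytab: $p" >&2; return 1; }
    done
}

# Run the check only where klist and the keytab are actually present.
if [ -x "$KLIST" ] && [ -r /etc/hadoop/conf/hdfs.keytab ]; then
    check_keytab /etc/hadoop/conf/hdfs.keytab hdfs "$(hostname -f)" WIN.yourcompany.com
fi
```

The comparison is case-sensitive, so adjust the host name or realm if your keytab entries use different capitalisation.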
 
