Question:
Hadoop has native support for Kerberos and is distributed by Cloudera, which provides a guide on setting up Hadoop with Kerberos here:
On Step 4: Create and Deploy the Kerberos Principals and Keytab Files:
It seems this can be achieved with Centrify's adkeytab command, but there appear to be issues.
The basics of the Hadoop setup are as follows:
opbhdname001.lab.yourcompany.com
opbhddata004.lab.yourcompany.com
opbhddata005.lab.yourcompany.com
opbhddata006.lab.yourcompany.com
opbhddata007.lab.yourcompany.com
opbhddata008.lab.yourcompany.com
opbhddata009.lab.yourcompany.com
opbhddata010.lab.yourcompany.com
opbhddata011.lab.yourcompany.com
Each host is already 'Centrified' for authentication against the realm, and users are logging in successfully.
According to the Hadoop security documentation, there need to be three service accounts on each host:
- hdfs
- mapred
- yarn
Each of these service accounts should have a separate keytab file containing the service account principal and the 'host' principal.
For example on opbhdname001.lab.yourcompany.com there should be a keytab file at /etc/hadoop/conf/hdfs.keytab containing keys for:
hdfs/opbhdname001.lab.yourcompany.com@WIN.yourcompany.com
host/opbhdname001.lab.yourcompany.com@WIN.yourcompany.com
The /etc/hadoop/conf/yarn.keytab file should contain:
yarn/opbhdname001.lab.yourcompany.com@WIN.yourcompany.com
host/opbhdname001.lab.yourcompany.com@WIN.yourcompany.com
The same for the mapred user and for all the other hosts in the cluster.
What commands can be used by Centrify to achieve what is required by Hadoop?
Answer:
Please follow the steps below:
1. Set the following parameters in /etc/centrifydc/centrifydc.conf:
adclient.krb5.service.principals: http ftp cifs nfs mapred yarn hdfs
adclient.krb5.password.change.interval: 0
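Before rejoining, it can help to confirm that the two parameters were actually picked up from the file. The following is only a sketch, assuming the file uses plain "key: value" lines with space-separated service names; conf_value and has_services are hypothetical helper names, not Centrify commands.

```shell
#!/bin/sh
# Print the value of KEY from a "key: value" style file (last match wins).
# Dots in KEY are escaped so that they match literally.
conf_value() {
    file="$1"
    key=$(printf '%s' "$2" | sed 's/\./\\./g')
    sed -n "s/^[[:space:]]*${key}[[:space:]]*:[[:space:]]*//p" "$file" | tail -n 1
}

# Succeed only if every service named after FILE appears in
# adclient.krb5.service.principals (assumes space-separated values).
has_services() {
    file="$1"; shift
    listed=$(conf_value "$file" adclient.krb5.service.principals)
    for svc in "$@"; do
        case " $listed " in
            *" $svc "*) ;;
            *) echo "missing service: $svc" >&2; return 1 ;;
        esac
    done
}

CONF="${CONF:-/etc/centrifydc/centrifydc.conf}"
if [ -r "$CONF" ]; then
    has_services "$CONF" mapred yarn hdfs && echo "service principals OK"
    echo "password change interval: $(conf_value "$CONF" adclient.krb5.password.change.interval)"
fi
```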
2. Rejoin the domain so that the mapred, yarn and hdfs services are added to the computer account's SPNs:
adleave -r
adjoin <please fill in your adjoin syntax>
E.g.
adjoin mba.local -z test -c "mba.local/Unix/servers"
3. Create the service accounts for mapred, yarn and hdfs on one box (the resulting keytabs can then be copied to the other boxes and merged):
adkeytab --new -c <container> -K <keytab file location> -u <AD account that can create new account> <account>
E.g.
adkeytab --new -c "mba.local/Users" -K /tmp/mapred.keytab -u administrator mapred
adkeytab --new -c "mba.local/Users" -K /tmp/yarn.keytab -u administrator yarn
adkeytab --new -c "mba.local/Users" -K /tmp/hdfs.keytab -u administrator hdfs
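4. Pack the service account keytabs into a tar file (-C /tmp stores the files without a leading tmp/ directory, so that untarring in /tmp on the other hosts puts them directly in /tmp):
tar -cf /tmp/service_keytab.tar -C /tmp mapred.keytab hdfs.keytab yarn.keytab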
5. Deploy the tar file to each host and untar the file in /tmp
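Copying the tar file to every host can be scripted. This is a minimal sketch that assumes the tar file was built on opbhdname001, that the data nodes are the eight listed in the question, and that scp access to them is available. DRY_RUN and the run helper are hypothetical; by default the script only prints the commands it would execute.

```shell
#!/bin/sh
DOMAIN="lab.yourcompany.com"
TARBALL="/tmp/service_keytab.tar"

# With DRY_RUN=1 (the default here) commands are printed, not executed.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$@"
    else
        "$@"
    fi
}

# Copy the tar file into /tmp on each data node from the question.
deploy_keytabs() {
    for n in 004 005 006 007 008 009 010 011; do
        run scp "$TARBALL" "opbhddata$n.$DOMAIN:/tmp/"
    done
}

deploy_keytabs
```

Set DRY_RUN=0 to perform the copies, then untar on each host as described in step 5.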
* Repeat steps 6-10 on each host.
6. Copy the system default keytab into /tmp to perform the keytab merge:
cp /etc/krb5.keytab /tmp/
7. Use ktutil to perform the merge (run it from /tmp, because the keytab paths below are relative):
/usr/share/centrifydc/kerberos/sbin/ktutil
rkt krb5.keytab
rkt mapred.keytab
wkt mapred.merged.keytab
clear
rkt krb5.keytab
rkt yarn.keytab
wkt yarn.merged.keytab
clear
rkt krb5.keytab
rkt hdfs.keytab
wkt hdfs.merged.keytab
clear
exit
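Because the same four ktutil commands repeat for each service, the session can be generated rather than typed. The sketch below assumes that Centrify's ktutil reads commands from standard input the way MIT ktutil does; merge_commands and all_merge_commands are hypothetical helper names. If ktutil is not installed at the path used above, the script just prints the command stream.

```shell
#!/bin/sh
KTUTIL="/usr/share/centrifydc/kerberos/sbin/ktutil"

# Print the ktutil commands that merge krb5.keytab with one service keytab.
merge_commands() {
    svc="$1"
    printf 'rkt krb5.keytab\nrkt %s.keytab\nwkt %s.merged.keytab\nclear\n' "$svc" "$svc"
}

# Print one command stream covering all three services, ending with exit.
all_merge_commands() {
    for svc in mapred yarn hdfs; do
        merge_commands "$svc"
    done
    printf 'exit\n'
}

if [ -x "$KTUTIL" ]; then
    # Paths in the stream are relative, so run from /tmp.
    (cd /tmp && all_merge_commands | "$KTUTIL")
else
    all_merge_commands
fi
```

As far as I know, wkt appends to an existing file, so remove any old *.merged.keytab files in /tmp before re-running.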
8. Copy the merged keytabs to the destination folder under their final names:
cp /tmp/mapred.merged.keytab /etc/hadoop/conf/mapred.keytab
cp /tmp/hdfs.merged.keytab /etc/hadoop/conf/hdfs.keytab
cp /tmp/yarn.merged.keytab /etc/hadoop/conf/yarn.keytab
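Keytabs hold long-term keys, so it is customary to make each one readable only by the account that uses it. The steps above do not set ownership or permissions, so the following is a sketch resting on two assumptions: each keytab should be owned by the service user of the same name, and mode 400 is acceptable. install_keytab is a hypothetical helper.

```shell
#!/bin/sh
# Copy a keytab into place and make it readable only by its owner.
install_keytab() {
    src="$1"; dest="$2"; owner="$3"
    cp -f "$src" "$dest" || return 1
    chmod 400 "$dest" || return 1
    # Changing ownership needs root and a resolvable user; skip otherwise.
    if [ "$(id -u)" -eq 0 ] && id "$owner" >/dev/null 2>&1; then
        chown "$owner" "$dest"
    fi
}

for svc in mapred hdfs yarn; do
    if [ -f "/tmp/$svc.merged.keytab" ]; then
        install_keytab "/tmp/$svc.merged.keytab" "/etc/hadoop/conf/$svc.keytab" "$svc"
    fi
done
```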
9. Clean up the keytabs in /tmp:
rm /tmp/*keytab
10. Verify that kinit -kt works:
/usr/share/centrifydc/kerberos/bin/kinit -kt /etc/hadoop/conf/hdfs.keytab hdfs
/usr/share/centrifydc/kerberos/bin/klist
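Beyond kinit, each keytab can be checked for the two principals the question asks for (the service principal and the host principal). This is a sketch that assumes klist -kt prints one entry per line with the principal as the last field, that hostname -f returns the host's FQDN, and that the realm is the one from the question; lists_principal and check_keytab are hypothetical helpers.

```shell
#!/bin/sh
KRB_BIN="/usr/share/centrifydc/kerberos/bin"
REALM="${REALM:-WIN.yourcompany.com}"   # realm from the question; adjust as needed

# Succeed if the klist -kt output on stdin contains PRINCIPAL
# (case-insensitive, since host and realm case can differ in AD).
lists_principal() {
    grep -F -i -q -- " $1"
}

# Report whether KEYTAB holds both SVC/FQDN and host/FQDN in REALM.
check_keytab() {
    keytab="$1"; svc="$2"; fqdn="$3"
    listing=$("$KRB_BIN/klist" -kt "$keytab") || return 1
    for p in "$svc/$fqdn@$REALM" "host/$fqdn@$REALM"; do
        if printf '%s\n' "$listing" | lists_principal "$p"; then
            echo "ok      $keytab: $p"
        else
            echo "MISSING $keytab: $p"
        fi
    done
}

if [ -x "$KRB_BIN/klist" ]; then
    for svc in hdfs mapred yarn; do
        check_keytab "/etc/hadoop/conf/$svc.keytab" "$svc" "$(hostname -f)"
    done
fi
```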