Applies to: Centrify DirectControl version 5.1.2 on all platforms
Problem:
Centrify DirectControl version 5.1.2 introduced a new krb5 cache type called KCM (Kerberos Cache Memory).
When this cache type is in use, "top" output occasionally shows kcm using ~100% CPU:
  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM     TIME+ COMMAND
 9477 root      20   0 22812 1852 1108 R 99.8  0.0  12878:21 kcm
18687 l911849   20   0  875m 733m 1124 S  9.9  3.1   3299:11 tmux
22019 root      20   0 21328 2384 1140 R  0.7  0.0   0:00.07 top
10650 l515300   20   0 96144 2816 1832 S  0.3  0.0   0:01.01 sshd
22172 root      20   0     0    0    0 S  0.3  0.0   0:00.04 flush-253:0
26859 root      RT   0  674m 7048 4272 S  0.3  0.0   1:48.71 corosync
29944 root      20   0  311m  12m 7116 S  0.3  0.1   0:01.49 httpd
29951 root      20   0 59564  15m 1600 S  0.3  0.1   1:20.31 ruby
Sample output of strace shows:
poll([{fd=5, events=POLLIN}, {fd=7, events=POLLIN}], 2, -1) = 1 ([{fd=7, revents=POLLIN|POLLERR|POLLHUP}])
poll([{fd=5, events=POLLIN}, {fd=7, events=POLLIN}], 2, -1) = 1 ([{fd=7, revents=POLLIN|POLLERR|POLLHUP}])
Cause:
This is a bug in KCM. KCM does not properly handle corrupted IPC connections, which can be caused by interrupted kinit/klist/kdestroy operations or by force-killing adclient.
If a connection breaks while the server still has pending data to read or write on it, the server skips closing the socket. From then on, every poll call returns immediately with POLLERR | POLLHUP revents on that socket (as shown in the strace output above), but the socket is never closed. The process therefore polls in a tight loop and uses ~100% CPU.
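The busy loop described above can be reproduced in miniature. The following is a minimal sketch in Python, not the KCM source code: the function name count_wakeups, the iteration cap, and the 100 ms timeout are all assumptions made for illustration. It polls one end of a socket pair whose peer has gone away, and compares a server that leaves the dead socket registered with one that closes it when POLLERR or POLLHUP is reported.

```python
import select
import socket


def count_wakeups(close_on_hangup, max_iterations=1000):
    """Return how many times poll() wakes up for a single dead connection.

    Hypothetical helper for illustration only; not taken from KCM.
    """
    server_end, client_end = socket.socketpair()
    poller = select.poll()
    poller.register(server_end, select.POLLIN)
    open_socks = {server_end.fileno(): server_end}

    client_end.close()  # simulate a client that was killed mid-conversation

    wakeups = 0
    while open_socks and wakeups < max_iterations:
        # A real server would block indefinitely; the 100 ms timeout only
        # keeps this sketch from hanging if no event arrives.
        events = poller.poll(100)
        if not events:
            break
        for fd, revents in events:
            wakeups += 1
            if close_on_hangup and revents & (select.POLLERR | select.POLLHUP):
                # Correct handling: stop polling the dead socket and close it.
                poller.unregister(fd)
                open_socks.pop(fd).close()

    # Clean up anything the "buggy" variant left open.
    for sock in open_socks.values():
        sock.close()
    return wakeups
```

When close_on_hangup is False, poll() keeps returning immediately for the same descriptor until the iteration cap stops the loop, which mirrors the repeated poll lines in the strace output. When it is True, the loop wakes up once, closes the socket, and has nothing left to poll.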
Workaround:
If you notice this CPU usage problem, restart KCM to bring the kcm CPU usage back down:
/usr/share/centrifydc/bin/centrify-kcm restart
(Note that all tickets held in KCM are erased when kcm is restarted.)
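To spot the condition before restarting, the kcm line can be picked out of "top" output programmatically. The following Python sketch is one possible approach, not a Centrify tool: the function name busy_kcm_pids and the 90% threshold are assumptions, and it expects the twelve-column layout shown in the sample output above (PID first, %CPU ninth, COMMAND last), which can differ depending on how top is configured.

```python
def busy_kcm_pids(top_output, threshold=90.0):
    """Return PIDs of kcm processes whose %CPU is at or above threshold.

    Hypothetical helper; the threshold value is an arbitrary assumption.
    """
    pids = []
    for line in top_output.splitlines():
        fields = line.split()
        # Assumed columns: PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
        if len(fields) < 12 or fields[11] != "kcm":
            continue
        try:
            pid, cpu = int(fields[0]), float(fields[8])
        except ValueError:
            continue  # skip lines whose numeric columns do not parse
        if cpu >= threshold:
            pids.append(pid)
    return pids


# Two lines taken from the sample "top" output in this article.
sample = """\
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
9477 root 20 0 22812 1852 1108 R 99.8 0.0 12878:21 kcm
18687 l911849 20 0 875m 733m 1124 S 9.9 3.1 3299:11 tmux
"""
print(busy_kcm_pids(sample))  # prints [9477]
```

The text passed in could come from a single batch-mode run such as "top -b -n 1". A non-empty result suggests that kcm is in the busy-polling state and that the restart above is worth trying.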
Resolution:
This issue has been fixed in Centrify Suite 2014 (aka 5.1.3).