Centrify DirectControl 5.1
After decommissioning a domain controller, SSH login times were slower than expected. After checking SRV records, DNS, etc., the issue was narrowed down to /var/centrifydc/kset.dc.domain.com, which still listed the decommissioned DC in the GC, SMB and NTP lines.
The agent was stopped and, after renaming the kset.dc file and restarting the agent, the DC was no longer present and
SSH worked fine.
Centrify servers running DirectControl 4.4.x were unaffected.
Is there any reason why the agent did not switch to another DC?
/var/centrifydc/kset.dc.<domain> is not a hardcoded file. It is built by adclient, and we update it when we do a netstate call.
The code is resilient in the sense that when it detects that a DC cannot be used, it will seek out other DCs within 30 seconds.
With regard to updating servers in /var/centrifydc/kset.dc.<domain> files:
These files are organized by the services required.
There is a piece of code called the DC vending machine. Clients of the DC vending machine can ask for servers that offer specific services (KDC, KPassword, LDAP, SMB, NTP, GC...).
As new servers are found for specific services, they are updated in the kset file. It is normal for the file to have a mix of
servers (even old dead ones if no one is asking for a particular service, for instance SMB services).
Centrify DirectControl 4.4.3 did not support the level of server versus service granularity that 5.x does
(4.4.3 only remembered one server for all services).
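To see which service lines of a kset file still name a particular DC, a simple text search is usually enough. The following is a minimal sketch, not a Centrify tool: it assumes the kset file is plain text, and because the exact layout of the file is not described in this article it only does a case-insensitive substring search. The function name lines_mentioning and the host name in the usage note are hypothetical.

```python
from pathlib import Path


def lines_mentioning(path, hostname):
    """Return (line_number, text) pairs for lines that mention hostname.

    Assumption: the kset file is plain text. Its layout is not documented
    here, so this is only a case-insensitive substring search.
    """
    needle = hostname.lower()
    hits = []
    # errors="replace" keeps the search going if the file has odd bytes.
    text = Path(path).read_text(errors="replace")
    for number, line in enumerate(text.splitlines(), start=1):
        if needle in line.lower():
            hits.append((number, line.strip()))
    return hits
```

For example, lines_mentioning("/var/centrifydc/kset.dc.domain.com", "olddc.domain.com") would list every line that still refers to the hypothetical retired host olddc.domain.com. An empty result means the name no longer appears in the file.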
Based on the network trace and logs provided by the customer, it was observed that adclient attempted to find and connect to servers
in all trusted domains. It does this so that it is ready to authenticate users in trusted domains if such a user is
presented to it.
Centrify DirectControl 5.1.0-497 has a known issue. In internal tests, it was noticed that after shutting down the DC
that adclient is connected to and then running adinfo, the kset.dc.domaincontroller file is only partially updated; in particular, the GC entry is not updated.
During this time, a login attempted with a SAM account name takes a long time, on the order of a minute.
This is because adclient is still trying to use the retired DC as its GC and needs to figure out that it is no longer there.
After the delay on the first login, the GC entry in the kset file gets updated and all subsequent logins are fast.
1) Navigate to /var/centrifydc/
2) Rename or delete the kset.dc.<domain> file
3) Restart centrifydc so that the file gets rebuilt.
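Steps 1 and 2 above can be sketched as a small helper that renames the file rather than deleting it, so the old contents stay available for comparison. This is only an illustration under stated assumptions: the function name set_aside_kset and the ".bak-<timestamp>" suffix are arbitrary choices for this sketch, it must be run with sufficient privileges, and stopping and restarting the agent (step 3) is left to the administrator.

```python
import time
from pathlib import Path


def set_aside_kset(domain, directory="/var/centrifydc"):
    """Rename kset.dc.<domain> so that it is rebuilt after a restart.

    Returns the new path, or None if the file does not exist.
    The ".bak-<timestamp>" suffix is an arbitrary choice for this sketch.
    """
    kset = Path(directory) / f"kset.dc.{domain}"
    if not kset.exists():
        return None
    stamp = time.strftime("%Y%m%d%H%M%S")
    backup = kset.with_name(f"{kset.name}.bak-{stamp}")
    kset.rename(backup)
    return backup
```

Calling set_aside_kset("domain.com") would rename /var/centrifydc/kset.dc.domain.com if it is present. The file is rebuilt only once centrifydc has been restarted.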
If there are domains which are not accessible to adclient, it will waste a lot of time trying to find a live server. In this case
it's best to blacklist these domains (or whitelist only the domains you want adclient to look at).
To blacklist a domain, the customer can set the following parameter in /etc/centrifydc/centrifydc.conf:
adclient.excluded.domains: anvil.acme.com cayote.acme.com
where anvil.acme.com and cayote.acme.com are to be replaced by the actual domain names. Customers should make sure
that no users come from these domains.
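Before and after editing the file, it can help to confirm which domains are actually excluded. The sketch below reads configuration text and returns the value of adclient.excluded.domains. It is not part of the product: the function name excluded_domains is hypothetical, and it assumes one "name: value" setting per line (as in the example above), whitespace-separated values, and "#" as the comment marker.

```python
def excluded_domains(conf_text):
    """Return the domains listed under adclient.excluded.domains.

    Assumptions: one "name: value" setting per line, values separated
    by whitespace, and lines starting with "#" are comments.
    """
    domains = []
    for raw in conf_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        name, sep, value = line.partition(":")
        if sep and name.strip() == "adclient.excluded.domains":
            # Choice made for this sketch: if the setting appears more
            # than once, the last occurrence is the one returned.
            domains = value.split()
    return domains
```

With the example line above as input, the function returns the two placeholder domain names. An empty list means the parameter is not set, which is the state to return to once the workaround is no longer needed.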
Customers running into this issue should upgrade to Centrify DirectControl 5.1.2 or higher. After upgrading, they should
remove the blacklisted domains mentioned in the workaround.