An existing DirectAudit agent is online and responding; however, it is no longer sending data to the collector. Other agents attached to the same collector are streaming without issue.
Applies to: Centrify DirectAudit Agent 3.2.0 and below on all supported platforms
Problem: The following behavior is observed on a machine running the Centrify DirectAudit Agent. Running dainfo shows the agent as connected; however, the size of the offline store (spool) is higher than normal and is not decreasing. Restarting the agent does not change this behavior. The file system holding the offline store has plenty of free disk space. Other DirectAudit agents are spooling to the same collector without issue.
[root@server1 ~]# dainfo
Pinging adclient: adclient is available
Daemon status: Online
Current collector: collector1.ocean.net:5063:HOST/COLLECTOR1.ocean.net@OCEAN.NET
Offline store size: (>1MB)
Despool rate: 0.00 Bytes/second
Getting offline database information:
  Size on disk: (>1MB)
  Database filesystem use: (sufficient disk space)
DirectAudit NSS module: Active
User (root) audited status: Yes
DirectAudit is not configured to audit individual commands.
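The symptom above (daemon online, despool rate at zero) can be checked mechanically. The following is a minimal sketch, not a Centrify-supplied tool: spool_stalled is a hypothetical helper that reads dainfo output on standard input, and it assumes the field labels match the sample output shown above, which may differ between agent versions.

```shell
#!/bin/sh
# Hypothetical helper (not part of DirectAudit): reads dainfo output on stdin.
# Returns 0 when the daemon reports Online but the despool rate is zero,
# and 1 otherwise. Field labels are taken from the sample output above.
spool_stalled() {
    awk -F': *' '
        /^[ \t]*Daemon status:/ { status = $2 }
        /^[ \t]*Despool rate:/  { split($2, parts, " "); rate = parts[1] }
        END {
            # Stalled = online, a rate was reported, and that rate is zero.
            if (status ~ /^Online/ && rate != "" && rate + 0 == 0) exit 0
            exit 1
        }
    '
}
```

For example, `dainfo | spool_stalled && echo "spool is not draining"`. A single zero reading is not conclusive on its own, since an idle agent with nothing to send would presumably also report a zero rate; compare the offline store size across several runs before concluding that the spool is stuck.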
Additionally, the spool may contain critical data, so an option to preserve it may be required.
Cause: There is a known issue in the Centrify DirectAudit Agent for UNIX/Linux where a “dirty shutdown” (an improper agent shutdown) can corrupt the offline store. The agent has no way to recover it because of the way it linearly processes the offline store files. Although the agent has an open connection to the collector, it can only stream data in the order it was recorded, so new session data cannot be sent to the collector while the corrupt data remains ahead of it.
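To illustrate why one damaged file stops everything queued behind it, here is a toy model of in-order processing. It is not agent code: drain_in_order is a hypothetical function, and each argument is an invented name:state pair standing in for one offline store file in recorded order.

```shell
#!/bin/sh
# Toy model only (hypothetical, not how the agent is implemented).
# Each argument is "name:state"; files are handled strictly in the order given.
drain_in_order() {
    for entry in "$@"; do
        name=${entry%%:*}    # text before the first colon
        state=${entry#*:}    # text after the first colon
        if [ "$state" = "corrupt" ]; then
            # Nothing after this point is attempted, even if it is intact.
            echo "blocked at $name"
            return 1
        fi
        echo "sent $name"
    done
}
```

Calling `drain_in_order 1:ok 2:corrupt 3:ok` reports that file 1 was sent and that processing is blocked at file 2; file 3 is never attempted even though it is intact, which mirrors the behavior described above.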
Workaround: Use either option below depending on the importance of preserving as much of the existing session data as possible:
Option 1) The offline store is not needed and may be purged.
Option 2) The offline store is crucial and must be recovered to the extent possible.
In this situation, please follow the steps in both sections below to recover the offline store:
Section A) Prepare to recover the offline store (spool)
1. Download the following attached files, which are needed for this recovery process:
   a. Repairdbq.zip – A compressed archive containing the offline store repair utility binary for all supported DirectAudit agent platforms.
   b. Swapin.sh – A script to move the offline store from the temporary location used for repair operations back to the active offline store location.
Section B) Repair and move the corrupt offline store (spool) back to the active location
1. Disable DA auditing.
   - Run dacontrol -d
2. Close the session and login with a non-audited session (run dainfo to confirm).
3. Make sure all other sessions are closed out and there is no auditing being done on the machine.
4. Back up the DBQC (/var/centrifyda/spool-dbqc) to another location, such as /var/centrifyda/spool-dbqc-bak, as a precaution (you may perform this prior to step 1 to minimize downtime).
   - For example (-R is needed because the spool is a directory): cp -Rp /var/centrifyda/spool-dbqc /var/centrifyda/spool-dbqc-bak
5. Copy the repairdbq utility to the old DBQC (/var/centrifyda/spool-dbqc-bak) folder and run repairdbq to ensure the spool will stream once it is merged into the active spool:
   Run: sudo ./repairdbq /var/centrifyda/spool-dbqc-bak/*
   - Please note the * at the end, as the utility repairs one file at a time.
   - If a file is repaired, the original copy is kept in /var/centrifyda/dbqc-tmp/, with ".bak" at the end of the filename.
   - DBQ files may become much smaller after repair, especially the low-numbered files. This is because a DBQ file may be sparse and the utility removes the empty space.
   - If a file does not need repairing, the original file is left untouched.
Before proceeding, verify the following: (1) The DBQ files in /var/centrifyda/spool-dbqc-bak/* are all repaired. (2) /var/centrifyda/spool-dbqc/* is being used by the DA agent (check for recent files).
6. Stop the DirectAudit agent (dad).
   - Stop dad by running system tools (e.g., /etc/init.d/centrifyda stop or dastop). Do not run kill -9 on the dad process, as this is the equivalent of a "dirty shutdown".
7. Back up /var/centrifyda/spool-dbqc as another precaution.
8. Copy and run the swapin script (from any location) as root: ./swapin.sh /var/centrifyda/spool-dbqc-bak /var/centrifyda/spool-dbqc
   - The script will print the commands it uses to move and rename the DBQC files.
   - When everything is done correctly, it prints "DONE".
9. Enable DirectAudit auditing.
   - Run dacontrol -e
10. Start dad.
   - Start dad by running system tools (e.g., /etc/init.d/centrifyda start or dastart).
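The verification between steps 5 and 6 can be scripted so that the agent is not stopped while the repaired copy is missing or empty. This is a rough sketch under stated assumptions: preflight is a hypothetical helper, the 10-minute default for "recent" is an arbitrary choice, and find -mmin is a GNU/BSD extension that may not exist on every supported UNIX platform. It checks only that files are present and recently modified; it cannot confirm that repairdbq actually repaired them.

```shell
#!/bin/sh
# Hypothetical preflight helper (not shipped with DirectAudit).
# Usage: preflight REPAIRED_DIR ACTIVE_DIR [MINUTES]
preflight() {
    repaired=$1
    active=$2
    minutes=${3:-10}    # assumed window for "recent"; adjust as needed

    # (1) The repaired copy must exist and contain at least one file.
    if [ -z "$(find "$repaired" -type f 2>/dev/null | head -n 1)" ]; then
        echo "no files found in $repaired" >&2
        return 1
    fi

    # (2) The active spool should contain files modified within the window,
    #     which suggests the agent is still writing to it.
    #     -mmin is a GNU/BSD find extension.
    if [ -z "$(find "$active" -type f -mmin "-$minutes" 2>/dev/null | head -n 1)" ]; then
        echo "no files modified in the last $minutes minutes in $active" >&2
        return 1
    fi

    echo "preflight OK"
}
```

With the paths used in this article, the call would be `preflight /var/centrifyda/spool-dbqc-bak /var/centrifyda/spool-dbqc`, continuing to step 6 only if it prints "preflight OK".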
Resolution: The condition which can result in a corrupted offline store after a “dirty shutdown” has been addressed in Centrify Suite 2014.1 (DirectAudit Agent version 3.2.1). Please upgrade to this version (and corresponding DirectControl version) to decrease the risk of a corrupted spool occurring again on your system.
As a best practice, where possible we recommend upgrading the DirectAudit agent before repairing the spool, so that the system has the latest bug fixes, including the newly added dirty shutdown protection features.
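When checking many machines, the version boundary stated in this article (affected: 3.2.0 and below; fixed: 3.2.1) can be expressed as a small comparison. The sketch below is illustrative only: is_affected_version is a hypothetical helper, it expects the agent version to be passed in as a plain numeric MAJOR.MINOR.PATCH string, and how that string is obtained on a given platform is left to the reader.

```shell
#!/bin/sh
# Hypothetical helper: returns 0 when the given version is older than 3.2.1
# (affected by the issue in this article) and 1 otherwise.
# Assumes a purely numeric MAJOR.MINOR.PATCH string such as 3.2.0.
is_affected_version() {
    fixed_major=3
    fixed_minor=2
    fixed_patch=1

    # Split the argument on dots into positional parameters.
    old_ifs=$IFS
    IFS=.
    set -- $1
    IFS=$old_ifs

    major=${1:-0}
    minor=${2:-0}
    patch=${3:-0}

    # Compare component by component, most significant first.
    [ "$major" -lt "$fixed_major" ] && return 0
    [ "$major" -gt "$fixed_major" ] && return 1
    [ "$minor" -lt "$fixed_minor" ] && return 0
    [ "$minor" -gt "$fixed_minor" ] && return 1
    [ "$patch" -lt "$fixed_patch" ]
}
```

For instance, `is_affected_version 3.2.0 && echo "upgrade recommended"` prints the message, while 3.2.1 and later versions do not. Comparing components numerically rather than as strings keeps a version such as 3.10.0 from being treated as older than 3.2.1.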