000028992 - Security Analytics 10.X log collector stops processing logs

Document created by RSA Customer Support Employee on Jun 14, 2016
Last modified by RSA Customer Support Employee on Apr 21, 2017
Version 2

Article Content

Article Number: 000028992
Applies To: RSA Security Analytics 10.x Log Collection
Issue: When the log collector stops processing logs, the following messages are repeated in /var/log/messages:
nw[14193]: [MessageBroker] [warning] warning 2014-10-30T04.13.43Z disk resource limit alarm cleared on node sa@localhost
nw[14193]: [MessageBroker] [info] info 2014-10-30T04.13.44Z Disk free space insufficient. Free bytes:2536435712 Limit:2539018480
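
To read the numbers in the second message above, it can help to compute how far free space has fallen below the limit. The following is a minimal sketch, not part of the product: `disk_shortfall` is a hypothetical helper name, and it assumes the log line carries the `Free bytes:<n> Limit:<n>` fields exactly as shown in the sample.

```shell
#!/bin/sh
# Hypothetical helper (not an RSA tool): given a "Disk free space insufficient"
# log line, print how many bytes the free space is below the configured limit.
disk_shortfall() {
    # Extract the two numeric fields from the line passed as $1.
    free=$(printf '%s\n' "$1" | sed -n 's/.*Free bytes:\([0-9][0-9]*\).*/\1/p')
    limit=$(printf '%s\n' "$1" | sed -n 's/.*Limit:\([0-9][0-9]*\).*/\1/p')
    # Fail if either field is missing from the line.
    [ -n "$free" ] && [ -n "$limit" ] || return 1
    echo $(( $limit - $free ))
}

# Sample line copied from the messages above.
line='nw[14193]: [MessageBroker] [info] info 2014-10-30T04.13.44Z Disk free space insufficient. Free bytes:2536435712 Limit:2539018480'
disk_shortfall "$line"    # prints 2582768
```

In the sample, free space is only about 2.5 MB below the limit, which is why the alarm can clear and re-trigger repeatedly.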
Cause: The RabbitMQ filesystem (/var/netwitness/logcollector) has filled up, possibly due to connectivity issues with the Log Decoder.
Check /var/netwitness/logcollector/rabbitmq/mnesia/sa\@localhost/msg_store_persistent/ for *.rdq files building up.
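
The check described above can be sketched as a short script. This is an illustration under assumptions: `count_rdq` is a hypothetical helper name, and the script assumes a `find` that supports `-maxdepth` (as GNU findutils does). The paths are the ones given in this article.

```shell
#!/bin/sh
# Hypothetical helper: count the *.rdq files directly inside directory $1.
count_rdq() {
    find "$1" -maxdepth 1 -type f -name '*.rdq' | wc -l | tr -d ' '
}

# Message store path from this article (quoted, so the @ needs no backslash).
STORE='/var/netwitness/logcollector/rabbitmq/mnesia/sa@localhost/msg_store_persistent'

if [ -d "$STORE" ]; then
    echo "rdq files: $(count_rdq "$STORE")"
    # Show how full the RabbitMQ filesystem is.
    df -h /var/netwitness/logcollector
fi
```

A count that keeps growing between runs, together with shrinking free space, suggests messages are queuing faster than they are consumed.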

Resolution: Follow the steps below to resolve the issue.
1. Stop rabbitmq-server and nwlogcollector services
         service rabbitmq-server stop
         stop nwlogcollector
2. Edit /etc/rabbitmq/rabbitmq.config and set disk_free_limit to half its current value. Note the original value so that it can be restored in step 8.
3. Start rabbitmq-server
         service rabbitmq-server start
         Do not start the nwlogcollector service at this time. New *.rdq files should not be created until it is certain that the old ones are being processed and consumed.
4. Check that the *.rdq files are beginning to process (ls -l /var/netwitness/logcollector/rabbitmq/mnesia/sa\@localhost/msg_store_persistent/ | wc -l).
         Verify that the number of files is decreasing.
5. If the *.rdq files are not being processed, resolve the connectivity issues between the Virtual Log Collector (VLC) and the Log Decoder (LD)
6. If the number of *.rdq files is decreasing, let the service run until enough *.rdq files have been processed and sufficient disk space has been recovered on the filesystem
7. Once enough disk space has been recovered, stop the rabbitmq-server service
         service rabbitmq-server stop
8. Restore the 'disk_free_limit' variable in /etc/rabbitmq/rabbitmq.config to its previous value. (In most cases 20% of the size of the filesystem is reasonable)
9. Start rabbitmq-server and nwlogcollector services
         service rabbitmq-server start
         start nwlogcollector
10. Verify that new *.rdq files are being created and the oldest *.rdq files continue to be processed and consumed.
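
The arithmetic in step 2 and the "is the count decreasing?" check in steps 4 and 10 can be sketched as follows. This is a minimal illustration, not an RSA-supplied tool: `half_limit`, `rdq_trend`, `count_rdq`, and the `INTERVAL` variable are hypothetical names, the 60-second sampling interval is an arbitrary assumption, and the example limit is the value from the sample log message in this article.

```shell
#!/bin/sh
# Step 2 arithmetic: print half of a disk_free_limit value given in bytes.
half_limit() {
    echo $(( $1 / 2 ))
}

# Steps 4 and 10: compare two samples of the *.rdq file count.
# $1 is the earlier count, $2 the later one.
rdq_trend() {
    if [ "$2" -lt "$1" ]; then
        echo draining
    elif [ "$2" -gt "$1" ]; then
        echo growing
    else
        echo unchanged
    fi
}

# Count the *.rdq files directly inside directory $1.
count_rdq() {
    find "$1" -maxdepth 1 -type f -name '*.rdq' | wc -l | tr -d ' '
}

half_limit 2539018480    # prints 1269509240

STORE='/var/netwitness/logcollector/rabbitmq/mnesia/sa@localhost/msg_store_persistent'
INTERVAL=${INTERVAL:-60}    # seconds between samples (assumed value)

if [ -d "$STORE" ]; then
    before=$(count_rdq "$STORE")
    sleep "$INTERVAL"
    after=$(count_rdq "$STORE")
    echo "$before -> $after: $(rdq_trend "$before" "$after")"
fi
```

A result of "draining" during step 4 corresponds to the backlog being consumed. During step 10, with the collector running again, the count may fluctuate, so several samples are more informative than one.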
Notes: It is important to check that insufficient disk space is actually the cause of the issue. If you only increase the available disk space, you may just be delaying the next disk alarm. If the disk limit is constantly being reached, more messages are arriving in the system than can be sent out, so a backlog is building up.
Please see other solution articles such as