Decoder won't start
We have an issue with the RSA Decoder.
The monitoring system shows the service status as green and ready, but the health status shows "Capture stopped" with a red sign. How can we identify the problem and restore normal operation?
The services have been restarted several times.
Here are the most common causes I've come across. If neither of these solves your problem, I'd suggest opening a support case with RSA Customer Support.
First and foremost, you should read through the /var/log/messages file on the decoder. It should tell you why capture is stopping. Also ensure that either "capture autostart" is turned on, or that you've manually clicked "start capture."
- Partition full or minimum free space exceeded. This can be due to other files in the decoder database directory consuming space, typically core files. If your partition is more than 95% full, this may be the issue. Try this command:
find /var/netwitness -name "core.*"
If you find any core files, check their size and what time they were created. If they are old, it's probably safe to delete them. If they're new, there might be other issues. It may also be worth checking your /var/log/ partition. If that's full, it might be this issue: 000037185 - RSA NetWitness Platform 11.x /var/log mount is full due to logstash directory
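To see the size and timestamp of each core file in one go, something like the following should work. It's only a rough sketch: `list_core_files` is a helper name I made up for illustration, and it assumes a `find` that supports `-exec ... +`, which is standard on most Linux systems.

```shell
# Hypothetical helper: list core files under a directory with their
# size and modification time. Defaults to the path used in this thread.
list_core_files() {
    dir="${1:-/var/netwitness}"
    # "-exec ls -lh {} +" prints a long listing for every match;
    # nothing is printed when there are no core files.
    find "$dir" -type f -name "core.*" -exec ls -lh {} + 2>/dev/null
}
```

Run it as `list_core_files /var/netwitness`, then use `df -h /var/log` to check whether the log partition is full as well.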
- Interface issues. This is less common, but it happens. If you've made any changes to your capture settings or 10G capture, ensure that your settings are correct. See "Decoder: Configure 10G Capability."
Aleksey, can you run 'tail -f /var/log/messages' at the command line, then click "Start Capture" on your decoder? Watch the logs as capture starts up and then as it fails. Copy the log lines from the point of failure and post them here for us to see.
This is the message that relates to my reason one posted earlier:
[Decoder] [warning] Session database free space threshold exceeded (/var/netwitness/decoder/sessiondb, 889.42 MB free), capture is stopping. Please check drive and configuration.
The most likely possibility is that there are core dumps or other non-database files in that directory. These are not trimmed by the size-roll schedule.
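If you want to see what is taking up space in that directory, a sketch like this lists the largest files first, which makes stray non-database files easier to spot. `largest_files` is a made-up helper name, and the path in the usage note is the one from the warning above.

```shell
# Hypothetical helper: show the N largest files under a directory.
# Sorts on the size column (field 5) of the "ls -l" output.
largest_files() {
    dir="$1"
    count="${2:-10}"
    find "$dir" -type f -exec ls -l {} + 2>/dev/null \
        | sort -k5,5 -n -r \
        | head -n "$count"
}
```

For example, `largest_files /var/netwitness/decoder/sessiondb 20` prints the 20 biggest files in the session database directory.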
Thank you for your help. I found two mount points that are 95% and 100% full, but there were no core files. Just for clarification, do I understand correctly that these mount points are full of logs, there is no space left to write new ones, and that is why capture can't start?
Can you post the full "df -h" output? What kind of appliance is this?
The meta, session, and packet mount points contain database files, not logs. Normally, they fill up to 95%. After they hit 95%, old files are deleted to make room for new ones, so usage should stay at about 95%.
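A quick way to compare a mount point against that 95% mark is sketched below. The helper names `usage_percent` and `check_mount` are my own, and the 95 default simply mirrors the figure mentioned above rather than any setting read from the decoder.

```shell
# Hypothetical helper: print the used percentage of the filesystem
# that holds the given path (POSIX "df -P" keeps output on one line).
usage_percent() {
    df -P "$1" 2>/dev/null | awk 'NR==2 { sub(/%/, "", $5); print $5 }'
}

# Hypothetical helper: report usage and return 1 when it is above
# the limit (default 95, an assumption taken from this thread).
check_mount() {
    path="$1"
    limit="${2:-95}"
    pct=$(usage_percent "$path")
    [ -n "$pct" ] || return 2          # df could not read the path
    if [ "$pct" -gt "$limit" ]; then
        echo "$path: ${pct}% used, above ${limit}%"
        return 1
    fi
    echo "$path: ${pct}% used"
}
```

For example, `check_mount /var/netwitness/decoder/sessiondb` reports the session partition. Substitute your own meta and packet mount points as shown in "df -h".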
If your session partition has hit 100%, that likely means either:
- there are non-database files in there
- the data roll-off function isn't running
- the partition is sized too small
In the third case, new sessiondb files may be too large, so the free space minimum is violated before the roll-off function runs. The two solutions are to increase the partition size or reduce the sessiondb file size. Both of these should be done with guidance from support.
There should be messages in /var/log/messages containing the string "System is deleting session entries", followed by a list of the sessions being deleted and their size. If you don't see them, the roll-off function probably isn't running. To me, your partition size looks okay.
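To check whether roll-off has been running, you could search the log for that string. This is a minimal sketch: `rolloff_events` is a hypothetical helper name, and it only looks at the current log file, not rotated ones.

```shell
# Hypothetical helper: show the most recent roll-off messages in a
# log file. An empty result suggests roll-off has not run recently.
rolloff_events() {
    logfile="${1:-/var/log/messages}"
    count="${2:-5}"
    grep -F "System is deleting session entries" "$logfile" | tail -n "$count"
}
```

Calling `rolloff_events` with no arguments prints the last five matches from /var/log/messages.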
At this point, I'd suggest getting Customer Support involved (RSA Customer Support). I am on a different team, so they'll be able to set up a meeting to troubleshoot with you on the live system. In the future, if you have any issues that are impacting performance, you should either open a case using that link or call one of the listed phone numbers. This community forum is more for non-urgent questions, and a post here is not considered a case.
I see that one of your colleagues did open a case so I've asked the support team to reach out to you.