000030822 - Cannot start collectd after upgrading to RSA Security Analytics 10.5.0.0 from 10.4.0.2

Document created by RSA Customer Support Employee on Jun 14, 2016
Last modified by RSA Customer Support Employee on Apr 21, 2017
Version 2

Article Content

Article Number: 000030822
Applies To
RSA Product Set: RSA Security Analytics
RSA Product/Service Type: Core Appliance, Event Stream Analysis (ESA), Malware Appliance, Archiver, Security Analytics Server
RSA Version/Condition:  10.5.0.0
Platform: CentOS
Platform (Other): collectd
O/S Version: Enterprise Linux 6
Issue
After upgrading the appliance from version 10.4.0.2 to 10.5.0.0, collectd will not start, statistics in Health & Wellness are not populated, and the following errors are observed in /var/log/messages:
 
Jun 24 18:07:28 Decoder yum[29027]: Updated: rsa-collectd-5.4.1.2979-5.el6.x86_64
Jun 24 18:07:28 Decoder yum[29027]: Installed: rsa-collectd-sms-10.5.0.0.2979-5.el6.x86_64
Jun 24 18:18:08 Decoder collectd[2471]: An error occurred publishing a statistic for plugin 97ce0735-b873-4be3-a4de-f82bdac4c154/sms_collectd.MessageBusWriteModule-counter-published.  Error: An error occurred publishing an AMQP Message.  Exchange name: carlos.sms.collectd; error: a socket error occurred; message size: 317
Jun 24 18:18:09 Decoder collectd[2471]: An error occurred publishing a statistic for plugin 97ce0735-b873-4be3-a4de-f82bdac4c154/decoder_decoder-gauge-assembler.packet.pages.  Error: An error occurred publishing an AMQP Message.  Exchange name: carlos.sms.collectd; error: An error occurred creating an AMQP Channel.  Configuration: {#012    "urn": "carlos.sms.collectd",#012    "connection":#012    {#012        "vhost": "\/rsa\/system"#012    }#012}#012; error: a socket error occurred; message size: 312
Jun 24 18:18:27 Decoder collectd[2471]: restreader.py: Unable to Connect to Endpoint.  Endpoint config: {'username': 'guest', 'password': '********', 'path': 'api/nodes', 'verify': False, 'scheme': 'https', 'port': 15671}; error: [Errno 111] Connection refused
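
To check quickly whether an appliance is logging the publishing errors shown above, the log can be searched for the common message text. The following is a minimal sketch only: count_collectd_publish_errors is a hypothetical helper name, and the pattern assumes the messages appear in the same form as in the excerpt above.

```shell
# Sketch: count collectd "publishing a statistic" errors in a syslog file.
# count_collectd_publish_errors is a hypothetical helper, not an RSA tool.
count_collectd_publish_errors() {
    # Default to the log file named in this article; another path may be passed.
    log="${1:-/var/log/messages}"
    # grep -c prints the number of matching lines (0 if there are none).
    # Note: grep exits with status 1 when the count is 0.
    grep -c 'collectd\[[0-9]*\]: An error occurred publishing a statistic' "$log"
}
```

Running count_collectd_publish_errors with no argument reads /var/log/messages. A count that stops increasing after the workaround has been applied suggests the errors are no longer occurring.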
Cause
The /etc/collectd.conf file was removed during the upgrade to Security Analytics 10.5.0.0. As a result, puppet did not finish its run and the collectd service cannot start.
Workaround
To resolve the issue, perform the steps below:
  1. Restore the collectd.conf file from its backup.
    cp /etc/collectd.conf.rpmsave /etc/collectd.conf

  2. Perform a puppet catalog run.
    puppet agent -t

  3. Tail the /var/log/messages file to ensure that the errors are no longer occurring.
    tail -f /var/log/messages
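
Step 1 above can be wrapped in a small guard so that an existing configuration file is never overwritten and a missing backup is reported. This is a sketch under stated assumptions, not an RSA-supplied script: restore_collectd_conf is a hypothetical helper name, and the backup is assumed to sit next to the configuration file with the .rpmsave suffix, as in Step 1.

```shell
# Sketch: restore collectd.conf from its .rpmsave backup only when it is missing.
# restore_collectd_conf is a hypothetical helper, not part of the product.
restore_collectd_conf() {
    # Defaults match the paths used in Step 1; both may be overridden.
    conf="${1:-/etc/collectd.conf}"
    backup="${2:-${conf}.rpmsave}"

    if [ -f "$conf" ]; then
        echo "$conf already exists; leaving it untouched"
        return 0
    fi
    if [ ! -f "$backup" ]; then
        echo "backup $backup not found; cannot restore" >&2
        return 1
    fi
    # -p keeps the mode and timestamps of the backup copy.
    cp -p "$backup" "$conf"
    echo "restored $conf from $backup"
}
```

With the function defined in a root shell, Steps 1 and 2 could then be run together as:
    restore_collectd_conf && puppet agent -t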

If you are unsure of any of the steps above or experience any issues, contact RSA Support and quote this article number for further assistance.
Notes
If an error occurs when performing Step 2 above, issue the two commands below to clear the lock and attempt the catalog run again.
rm /var/lib/puppet/state/agent_catalog_run.lock
puppet agent -t
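
Before removing the lock, it can be worth confirming that no puppet run is still in progress. The sketch below is illustrative only: clear_puppet_run_lock is a hypothetical helper name, and it assumes that the lock file holds the process ID of the agent run that created it, which should be verified on the appliance before relying on it.

```shell
# Sketch: remove the puppet catalog-run lock only when it looks stale.
# clear_puppet_run_lock is a hypothetical helper, not part of puppet.
clear_puppet_run_lock() {
    # Default is the lock path given in this article.
    lock="${1:-/var/lib/puppet/state/agent_catalog_run.lock}"

    if [ ! -e "$lock" ]; then
        echo "no lock file at $lock"
        return 0
    fi
    # Assumption: the lock file holds the PID of the agent run that created it.
    pid=$(tr -cd '0-9' < "$lock")
    # kill -0 sends no signal; it only tests whether the process exists.
    if [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null; then
        echo "process $pid still appears to be running; lock left in place" >&2
        return 1
    fi
    rm -f "$lock"
    echo "removed stale lock $lock"
}
```

With the function defined in a root shell, the retry could be run as:
    clear_puppet_run_lock && puppet agent -t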
