Article Content
Article Number | 000036554 |
Applies To | RSA Product Set: NetWitness Logs & Network
RSA Product/Service Type: Core Appliance
RSA Version/Condition: 10.6.x, 11.x |
Issue | The RabbitMQ service no longer starts. Attempting to start the service returns an error, and the /var/log/rabbitmq/startup_log file shows the following:
[root@appliance ~]# cat /var/log/rabbitmq/startup_log
ERROR: epmd error for host 529e5432-5c74-4521-8dad-1cc6a0735902: nxdomain (non-existing domain) |
Cause | When the RabbitMQ service starts, one of the first things it does is try to resolve the hostname specified in the /etc/rabbitmq/rabbitmq-env.conf file. In NetWitness, the default node name is:
sa@localhost for 10.6.x versions
rabbit@<nodeid> for 11.x versions
For example, on 10.6:
[root@appliance ~]# cat /etc/rabbitmq/rabbitmq-env.conf
NODENAME=sa@localhost <-----
ENABLED_PLUGINS_FILE=/etc/rabbitmq/rsa_enabled_plugins
To resolve the hostname (the part after the @), RabbitMQ first consults the /etc/hosts file and, if that fails, falls back to the DNS servers listed in /etc/resolv.conf. "localhost" (or the <nodeid> for 11.x versions) should always be present in the /etc/hosts file.
The nxdomain (non-existing domain) error therefore means that "localhost" (or the <nodeid> for 11.x versions) cannot be resolved to an IP address using the /etc/hosts file. This can be caused by a typo in the /etc/hosts file, or by incorrect permissions on the file that prevent the rabbitmq service from reading its contents. The correct permissions are 644, which can be verified with:
[root@appliance ~]# ls -lh /etc/hosts
|
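The checks described in the Cause can be sketched as a few shell helpers. This is a minimal sketch, not a supported RSA tool: the function names node_host, hosts_has_name and file_mode are hypothetical, the default paths are the ones named in this article, and file_mode assumes GNU stat as shipped on Linux appliances.

```shell
#!/bin/sh
# Hypothetical helpers (names are illustrative, not part of NetWitness).

# Print the host part of NODENAME (the text after "@") from a
# rabbitmq-env.conf file. Defaults to the path named in this article.
node_host() {
    conf="${1:-/etc/rabbitmq/rabbitmq-env.conf}"
    sed -n 's/^NODENAME=[^@]*@\([^[:space:]]*\).*$/\1/p' "$conf" | head -n 1
}

# Succeed (exit 0) if the given hosts file lists the given name
# on any non-comment line; fail (exit 1) otherwise.
hosts_has_name() {
    name="$1"
    hosts="${2:-/etc/hosts}"
    awk -v n="$name" '
        { sub(/#.*/, "") }                                  # drop comments
        { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
        END { exit found ? 0 : 1 }
    ' "$hosts"
}

# Print the octal permission bits of a file (assumes GNU stat).
file_mode() {
    stat -c '%a' "$1"
}

# Example use on an appliance (commented out so that sourcing this
# file has no side effects):
#   host=$(node_host)
#   hosts_has_name "$host" || echo "$host is missing from /etc/hosts"
#   [ "$(file_mode /etc/hosts)" = "644" ] || echo "unexpected permissions"
```

The helpers only read files; they change nothing, so they can be run before editing /etc/hosts to confirm which of the two causes applies.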
Resolution | Open /etc/hosts with the vi editor and make sure that it contains no stray characters, that all the IP addresses and hostnames are correct, and especially that the line starting with 127.0.0.1 includes "localhost" or the <nodeid>:
10.6.x:
127.0.0.1 LDecoder localhost localhost.localdomain localhost4 localhost4.localdomain4
11.x.x:
127.0.0.1 LDecoder localhost localhost.localdomain localhost4 localhost4.localdomain4 529e5432-5c74-4521-8dad-1cc6a0735902
(In 11.x, you can get the node ID by running the command: cat /etc/salt/minion | grep id)
If the permissions are not correct, change them with the command:
chmod 644 /etc/hosts
After fixing the typo or changing the permissions, check whether any rabbitmq processes are still running:
ps aux | grep rabbit
You should see only the line for the "ps aux | grep rabbit" command that you just ran. If any other rabbitmq-related processes are listed, kill them with the command:
kill <PID>
Then run the puppet agent to automatically fix any other discrepancies and to restart the rabbitmq service:
puppet agent -t |
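
The leftover-process check in the Resolution can be made less error-prone with a small filter that prints only the PIDs of rabbitmq-related lines and skips the grep line itself. This is a minimal sketch; the function name rabbit_pids is hypothetical, and it assumes the standard "ps aux" column layout in which the PID is the second field.

```shell
#!/bin/sh
# Hypothetical helper: read "ps aux"-style lines on stdin and print the
# PID (second column) of every line that mentions rabbit, ignoring any
# grep process. The bracketed patterns keep this awk command from
# matching its own command line when fed live ps output.
rabbit_pids() {
    awk '/[r]abbit/ && !/[g]rep/ { print $2 }'
}

# Example use on an appliance (commented out so that sourcing this
# file has no side effects):
#   ps aux | rabbit_pids
```

An empty result means no stray processes remain. Any PID that is printed can be passed to the kill command before running puppet agent -t.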