000037154 - RSA NetWitness Puppet master service becomes unresponsive and stops frequently

Document created by RSA Customer Support Employee on Feb 1, 2019
Version 1

Article Content

Article Number: 000037154
Applies To:
RSA Product Set: RSA NetWitness Logs and Packets
RSA Version/Condition: 10.6.x
Component: Puppet Master
Platform: CentOS
O/S Version: 6
Issue

Note: The puppet master on the NetWitness/SA Server and the puppet agent on each RSA NetWitness host are used mainly for provisioning and day-to-day configuration management.

In large environments managed by a single RSA NetWitness Server, the puppet master service has been observed to stop frequently. This disrupts the provisioning of new hosts and causes general issues with automatic host configuration management through puppet (the resulting errors look like communication errors).

This issue is most often seen when a large number of infrastructure hosts, such as Virtual Log Collectors (VLCs), are deployed at the same time. The puppet master service then becomes unstable, which in turn affects communication with other puppet agents whose configuration is already managed by the puppet master.

The puppet master stops responding because of the high number of network connections being made to TCP port 8140.
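To see how many connections are open on the puppet master port, the output of netstat can be summarised by TCP state. The following is a minimal sketch, not an RSA-supplied tool: the function name count_puppet_conns is hypothetical, and it assumes that netstat (net-tools) and awk are available on the SA server.

```shell
#!/bin/bash
# Summarise TCP connections on the puppet master port by connection state.
# Reads `netstat -ant` style lines on stdin, so it also works on saved output.
# count_puppet_conns is a hypothetical helper name, not part of NetWitness.
count_puppet_conns() {
    # In `netstat -ant` output, field 4 is the local address and field 6 the state.
    awk -v port="${1:-8140}" '
        $1 ~ /^tcp/ && $4 ~ (":" port "$") { states[$6]++; total++ }
        END {
            for (s in states) printf "%s %d\n", s, states[s]
            printf "TOTAL %d\n", total + 0
        }'
}

# Usage on the SA server:
#   netstat -ant | count_puppet_conns 8140
```

A total that keeps growing while agents report errors would be consistent with the behaviour described in this article, but the counts alone are not a diagnosis.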

Running the command “puppet agent -t” on the client shows errors such as:

“execution expired”

and

“SSL connection reset by peer”.



Running openssl locally on the SA server shows that the SSL handshake is reset:
 

# openssl s_client -connect puppetmaster.local:8140
CONNECTED(00000003)
write:errno=104

no peer certificate available

No client certificate CA names sent

SSL handshake has read 0 bytes and written 245 bytes

New, (NONE), Cipher is (NONE)
Secure Renegotiation IS NOT supported
Compression: NONE
Expansion: NONE


These issues may be observed when a single RSA NetWitness server manages more than 170 hosts.
If the puppet master service stops responding, no further hosts can be provisioned.
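One rough way to compare an environment against the 170-host figure is to count the agent certificates that the puppet master has signed. This is a sketch under assumptions: it relies on the Puppet 3.x command "puppet cert list --all", which prefixes signed certificates with "+", and the signed-certificate count only approximates the number of managed hosts (it may include the server itself or stale entries). The function name count_signed_certs is hypothetical.

```shell
#!/bin/bash
# Count signed certificates in `puppet cert list --all` style output on stdin.
# Signed entries start with "+"; pending and revoked entries are ignored.
# count_signed_certs is a hypothetical helper name.
count_signed_certs() {
    grep -c '^[+]'
}

# Usage on the SA server (assumes the Puppet 3.x `puppet cert` subcommand):
#   puppet cert list --all | count_signed_certs
```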

 
Cause

This is a limitation of the currently deployed version of the puppet infrastructure.

When the puppet master manages a large environment (e.g. more than 170 hosts), it may be unable to respond to puppet agent requests in a timely manner. This may result in failures to provision new hosts, or in general communication failures with existing puppet agents under management.
 
Resolution

RSA recommends that customers upgrade their RSA NetWitness version 10.x environment to version 11.x. Version 11 significantly improves the management of larger scale deployments. The existing puppet implementation has been replaced with a better infrastructure hosted by the RSA Orchestration service.

Please also be aware that RSA Security Analytics 10.6 (including all 10.6.x) will reach End of Product Support (EOPS) by Oct 2019. Customers are advised to schedule the migration to version 11 as soon as is convenient, to avoid running software that has reached end of support.
 
Workaround

In some customer environments, a regular restart of the puppet master service (using a mechanism such as a cron job) has been implemented to keep the service available. The restart resets all of the network connections on TCP port 8140 and allows communication with the puppet agents to be re-established.
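A scheduled restart of this kind could be set up with a cron.d entry. The sketch below is illustrative only: the file name /etc/cron.d/puppetmaster-restart, the 6-hour interval, and the helper name install_restart_cron are assumptions, and the init service name is assumed to be puppetmaster. Confirm the service name and choose an interval that suits the environment before using it.

```shell
#!/bin/bash
# Write a cron.d entry that restarts the puppet master at a fixed interval.
# install_restart_cron, the file path, and the schedule are hypothetical examples.
install_restart_cron() {
    local cron_file="${1:-/etc/cron.d/puppetmaster-restart}"
    cat > "$cron_file" <<'EOF'
# Restart the puppet master every 6 hours (assumed example interval)
0 */6 * * * root /sbin/service puppetmaster restart >/dev/null 2>&1
EOF
    chmod 0644 "$cron_file"
}

# Usage (as root on the SA server):
#   install_restart_cron
```

A restart interrupts any puppet agent run that is in progress, so a longer interval is less disruptive, while a shorter one clears accumulated connections sooner.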

The second option is a non-standard configuration that uses multiple puppet master services to support a larger scale deployment of hosts. This requires multiple NetWitness/SA servers and a change of environment architecture: hosts would be managed by more than one SA server, which changes the everyday usage of alerting and reporting. RSA recommends engaging RSA Professional Services in this case to ensure that the multiple puppet masters trust each other in this currently unqualified architecture.
 
