000034741 - Duplicate AgentID is causing connection errors in RSA NetWitness Endpoint

Document created by RSA Customer Support Employee on Feb 20, 2017. Last modified by RSA Customer Support Employee on Apr 21, 2017.
Version 3

Article Content

Article Number: 000034741
Applies To:
RSA Product Set: NetWitness Endpoint (ECAT)
RSA Version/Condition: 4.1.x, 4.2.x, 4.3.x
Platform: Windows
Issue: When searching for an agent, it either never appears in the Machines list of the UI, or it appears and then disappears. When searching by agentID, the machine shown for that ID keeps changing over time. A machine may never appear in the GUI at all, which can look like a connection problem even though there are no actual issues connecting to the agent.
Cause: A gold image or VM template was created at a customer site with the ECAT agent pre-installed as part of the deployment. The image is then pushed out to any number of machines, which results in many agents sharing the same agentID in the database. The scan4 files get merged together, causing incomplete agent data and agents that do not appear in the list of machines when searching. The affected agent entries in the database become unreliable for investigations because data from different machines is mixed together.
Gold images with the agent pre-installed are not currently supported, although changes to the agent in light of this issue are being investigated.
Identifying Duplicate Agents

1. Run the attached SQL script CaptureIPUpdated.sql. Note that the last three lines are commented out; they are used later to remove the trigger and the table. Leave them commented out for now so that the trigger and table get created.

  • A new table called CaptureAgents will be created in the ECAT$Primary database, with a trigger that records hostnames as they are overwritten.

2. Wait at least 48 hours to gather enough entries to reliably determine whether there are many duplicates.


3. Review the contents of the CaptureAgents table by running SELECT * FROM CaptureAgents in SQL Studio. This will provide a list of captured agents for review. The output is organized as follows (this is a sample that has been sanitized):
AgentID                               MachineName  OldMachineName  ChangedDate
12345678-1234-ABCD-1234-123456ABCDEF  NEWHOSTNAME  OLDHOSTNAME     2017-01-23 18:18:54.9330000
NOTE: Be careful interpreting the results. For instance, a single hostname change could be legitimate if hostnames are being changed in the environment. Additionally, the OldMachineName field is important because it may contain entries of 'Unknown'. These indicate new machines that have been added and should not be counted as duplicates, because new entries are expected to appear over time.
ADDITIONAL: Look for the same hostname repeating often; this is a strong indicator of a duplicate agentID. The ID will be the same, and the hostname will bounce back and forth. It may cycle through several hostnames, but the agentID will always be identical for these hostnames.
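
The review described above can be sketched in Python as a rough triage aid. This is only an illustration, assuming the CaptureAgents rows have been exported as (AgentID, MachineName, OldMachineName, ChangedDate) tuples. The function name find_suspect_agent_ids, the min_changes threshold, and the sample rows are hypothetical and are not part of the RSA scripts.

```python
from collections import defaultdict


def find_suspect_agent_ids(rows, min_changes=2):
    """Return {agent_id: sorted hostnames} for IDs whose hostname keeps changing.

    rows: iterable of (agent_id, machine_name, old_machine_name, changed_date).
    min_changes: hypothetical threshold; a single change may be a legitimate rename.
    """
    changes = defaultdict(int)
    hostnames = defaultdict(set)
    for agent_id, machine_name, old_machine_name, _changed_date in rows:
        # 'Unknown' marks a newly added machine, not an overwritten hostname.
        if old_machine_name.strip().lower() == "unknown":
            continue
        changes[agent_id] += 1
        hostnames[agent_id].update((machine_name, old_machine_name))
    return {
        agent_id: sorted(hostnames[agent_id])
        for agent_id, count in changes.items()
        if count >= min_changes
    }


# Made-up sample rows for illustration only.
sample_rows = [
    ("AAAA", "HOST-A", "HOST-B", "2017-01-23 18:18:54"),
    ("AAAA", "HOST-B", "HOST-A", "2017-01-23 19:02:10"),
    ("AAAA", "HOST-A", "HOST-B", "2017-01-24 07:45:31"),
    ("BBBB", "HOST-NEW", "HOST-OLD", "2017-01-23 12:00:00"),  # single rename
    ("CCCC", "HOST-C", "Unknown", "2017-01-23 13:00:00"),     # new machine
]

suspects = find_suspect_agent_ids(sample_rows)
print(suspects)  # {'AAAA': ['HOST-A', 'HOST-B']}
```

The threshold is a judgment call: raising min_changes reduces false positives from legitimate renames, at the cost of needing a longer capture window before duplicates are flagged.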



How to remove the duplicate AgentIDs

1. Run the script ParseDuplicateAgents.sql to parse the contents of the CaptureAgents table. This will generate a single list of hostnames that can be used, for instance with SCCM, to target the hosts that are receiving duplicate agentIDs. Copy this list to a text file.
2. Download the AgentID Scrambler file attached to this article, which is a .bat file.
3. Edit the .bat file in Notepad or another editor. In the line
SET _servicename=<insert_name_of_agent_service_here>
replace the placeholder with the name of the service for the ECAT agent.
4. Run the .bat file against the machine list generated in step 1 using SCCM or a similar tool. To avoid errors, the script must be able to access 64-bit binaries, so under SCCM it should be run with the Sysnative option, for example: C:\Windows\Sysnative\cmd.exe /C ecat_Uninstall_agentid_scrambler.bat
5. You should see the following output for each endpoint the .bat file is run against. The first message is a sanity check for the first registry key, which should already be removed:
Starting ECAT_AgentID_Scrambler.bat
Verifying existence of the following key:
ERROR: The system was unable to find the specified registry key or value.
It appears the value in that key does not exist.
This is good since the goal of this script is to delete it.
Verifying existence of the following key:
    ServiceUid    REG_BINARY    F862D0EA6227A742B1B19ECE4AE35EBC
Trying to erase ServiceUid value in key...
The operation completed successfully.
The above key was successfully deleted.
Scrambling completed SUCCESSFULLY.

NOTE: The second registry key in Temp SHOULD have a message confirming its deletion. If it does not, either the script was not run correctly or the endpoint never had the agent installed on it.

6. This will also remove the ECAT agent. If using SCCM, it should be possible to perform two actions: the first to run the .bat file against the agent list, and the second to install the agent using the agent packager. Regardless, the agent will need to be reinstalled on each of these endpoints.
7. Confirm in the UI that each endpoint shows up within an hour or so. How long the merge takes depends on the speed of the database, the number of agents being replaced, and network connectivity, so an endpoint may show in the UI within minutes or take longer.
8. Once all duplicates have been removed from the environment, the last 3 commands in the CaptureIPUpdated.sql script (they are commented out in the script) should be run to clear the CaptureAgents table, remove the trigger, and delete the table as part of cleanup. If more duplicates are suspected to still exist, run only the first command to delete the contents of the table, then begin searching for any remaining agents again from step 1.
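
When the .bat file is run against many endpoints, the check described in step 5 can be automated. The following is a minimal sketch in Python, assuming the console output of each run has been captured as text per hostname; how that output is collected (for example through SCCM) is not covered here. The function name scramble_succeeded and the sample hostnames are hypothetical. The marker strings are copied from the sample output in step 5.

```python
SUCCESS_MARKER = "Scrambling completed SUCCESSFULLY."
DELETED_MARKER = "The above key was successfully deleted."


def scramble_succeeded(output_text):
    """Return True if the output shows both the key deletion and the final success line."""
    lines = [line.strip() for line in output_text.splitlines()]
    return SUCCESS_MARKER in lines and DELETED_MARKER in lines


# Output matching the sample shown in step 5.
good_output = """Starting ECAT_AgentID_Scrambler.bat
Verifying existence of the following key:
ERROR: The system was unable to find the specified registry key or value.
It appears the value in that key does not exist.
This is good since the goal of this script is to delete it.
Verifying existence of the following key:
    ServiceUid    REG_BINARY    F862D0EA6227A742B1B19ECE4AE35EBC
Trying to erase ServiceUid value in key...
The operation completed successfully.
The above key was successfully deleted.
Scrambling completed SUCCESSFULLY.
"""

# Hypothetical incomplete output: the deletion confirmation never appears.
incomplete_output = """Starting ECAT_AgentID_Scrambler.bat
Verifying existence of the following key:
ERROR: The system was unable to find the specified registry key or value.
"""

captured = {"HOST-A": good_output, "HOST-B": incomplete_output}
needs_review = sorted(
    host for host, text in captured.items() if not scramble_succeeded(text)
)
print(needs_review)  # ['HOST-B']
```

Hosts listed in needs_review are candidates for a manual check before the agent is reinstalled, since the NOTE above says a missing deletion message means the script did not run correctly or the agent was never installed.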

Notes: Be careful when interpreting the results of the script. When comparing the agentIDs, single hostname changes that don't repeat over multiple days are probably not worth considering; the important results are many agents with the same ID (a dead giveaway) or repeated changes to the hostname. If the hostname changes only once for an agentID and not again, then it is likely that a system administrator legitimately modified the hostname.