When a Windows system encounters a fatal error that it cannot safely recover from, such as one caused by malfunctioning hardware, faulty memory, a badly written device driver, or hardware or software running beyond its specified limits, Windows halts and displays a stop error screen known as a Blue Screen of Death, or BSOD.
This article explains how to handle a BSOD when the NetWitness Endpoint agent is suspected of being either the cause or the victim of the crash.
1. Record the stop code
2. Reboot the system
3. Verify the agent version and upgrade agent if possible
4. Collect the event logs from Windows
5. Collect the crash dump
6. Open a new case with RSA support for root cause of the BSOD
BSODs have many different causes. The NetWitness Endpoint (NWE) agent is a security agent that runs as a background process and is essentially invisible to the user. It runs in two modes, one in user space and one in kernel space. Either could be involved in a BSOD, but the kernel-mode agent is the more likely candidate because it does most of the work, such as scanning and gathering tracking data.
The NWE agent does not record logs, either during normal operation or during a crash, so there is no agent logging to aid in analysis. Engineering must instead review the events surrounding the crash together with the processes that were running at the time. These are written to a Windows dump file during the crash, and the dump is the most informative source for determining the cause. The agent can be the cause of the crash, but it can also be the victim of another process.
BSOD Resolution Process
If the BSOD is still on screen, record the stop message (the error code number) so that it can be added to the notes in the support case later.
Reboot the machine and bring the Windows system back online, if this has not already been done.
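If the machine has already been rebooted and the stop code was not written down, it can usually be recovered from the System event log. The following is a minimal PowerShell sketch, assuming an English-language Windows installation where the event written after the restart (Event ID 1001) contains the word "bugcheck" in its message. Adjust the filter if the message is localized.

```powershell
# Look for the event Windows writes to the System log after restarting from a bug check.
# Assumption: the event message is in English and contains the word "bugcheck".
$bugchecks = Get-WinEvent -FilterHashtable @{ LogName = 'System'; Id = 1001 } -ErrorAction SilentlyContinue |
    Where-Object { $_.Message -match 'bugcheck' } |
    Select-Object -First 5 -Property TimeCreated, Message

# Each message includes the stop code and its parameters.
$bugchecks | Format-List
```

If nothing is returned, either no bug check has been logged or the message text differs on this system.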
Confirm basic information about the machine for the benefit of the support and engineering review. This can be found under Control Panel>System and Security>System. Relevant information includes:
Windows version (e.g. Windows 10)
RAM and CPU
64-bit or 32-bit OS
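The same details can be collected from a PowerShell prompt, which is convenient when they need to be pasted into the case notes. This is a sketch that assumes PowerShell 3.0 or later (required for Get-CimInstance). The property names in $summary are illustrative.

```powershell
# Gather the basic machine details requested for the support case.
$os  = Get-CimInstance -ClassName Win32_OperatingSystem
$cs  = Get-CimInstance -ClassName Win32_ComputerSystem
$cpu = Get-CimInstance -ClassName Win32_Processor | Select-Object -First 1

$summary = [pscustomobject]@{
    WindowsVersion = "$($os.Caption) (version $($os.Version), build $($os.BuildNumber))"
    OSArchitecture = $os.OSArchitecture                       # 64-bit or 32-bit
    InstalledRamGB = [math]::Round($cs.TotalPhysicalMemory / 1GB, 1)
    Processor      = $cpu.Name
}
$summary | Format-List
```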
Verify the agent version installed on the machine. This can be seen in the NWE UI under the hostname of that machine. This is relevant because older versions of the agent may have known BSOD-related bugs that are fixed only in later versions. If the installed agent is older than the current server version, upgrading it to the most recent available version could prevent future BSODs on the endpoint under investigation.
Below are steps to gather the Windows Event logs:
Click Start and in the search field type Event Viewer
Click Event Viewer
Right-click the System log, which is listed under Windows Logs
Click Save Log File As and save the log under an appropriate name. Alternatively, click Save All Events As, which does the same thing
Repeat for the Application logfile
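As an alternative to the Event Viewer steps above, both logs can be exported from an elevated PowerShell prompt with the built-in wevtutil tool. This is a minimal sketch. The output folder is a hypothetical example, and any writable location works.

```powershell
# Hypothetical output folder; change as needed.
$outDir = 'C:\Temp\bsod-case'
New-Item -ItemType Directory -Path $outDir -Force | Out-Null

foreach ($log in 'System', 'Application') {
    $target = Join-Path $outDir "$log.evtx"
    # epl = export-log; /ow:true overwrites a file left by an earlier run.
    wevtutil epl $log $target /ow:true
}

# List the exported files so their size and timestamp can be checked.
Get-ChildItem -Path $outDir -Filter *.evtx
```

The resulting .evtx files open in Event Viewer in the same way as logs saved through the UI.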
Below are the steps for enabling the BSOD crash dump:
The dump may not exist, so the first step is to make sure crash dumps are enabled. Go to Start Menu>Control Panel>System and Security and click System
Click on Advanced System Settings
On the Advanced tab, click Settings under Startup and Recovery
Check the Write debugging information section. Automatic memory dump should be selected. The other options are None, Small memory dump, Kernel memory dump, and Complete memory dump. Engineering prefers a Complete memory dump, but if that is considered too large, a Kernel memory dump is the next best choice.
Note the location of the dump file. For many users the default is %SystemRoot%\MEMORY.DMP, which gives both the file name and the location of the memory dump.
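The current setting can also be read without opening the dialog. The sketch below reads the CrashControl registry key that backs the Startup and Recovery settings and translates the CrashDumpEnabled value into the names shown in the dialog. It only reads the configuration and assumes the standard value mapping. The variable names are illustrative.

```powershell
$key = 'HKLM:\SYSTEM\CurrentControlSet\Control\CrashControl'
$crashControl = Get-ItemProperty -Path $key

# CrashDumpEnabled values and the matching dialog options.
$dumpTypes = @{
    0 = 'None'
    1 = 'Complete memory dump'
    2 = 'Kernel memory dump'
    3 = 'Small memory dump'
    7 = 'Automatic memory dump'
}

$code = [int]$crashControl.CrashDumpEnabled
$dumpType = if ($dumpTypes.ContainsKey($code)) { $dumpTypes[$code] } else { "Unknown ($code)" }

$dumpSettings = [pscustomobject]@{
    DumpType = $dumpType
    # Expand %SystemRoot% so the full path is visible.
    DumpFile = [Environment]::ExpandEnvironmentVariables([string]$crashControl.DumpFile)
}
$dumpSettings | Format-List
```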
To gather the dump, go to the dump file location noted in the previous step and collect the MEMORY.DMP file. Default locations are typically C:\Windows\MEMORY.DMP or C:\MEMORY.DMP
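Before opening the case, it helps to confirm that a dump file actually exists, how large it is, and that its timestamp matches the time of the crash. This is a minimal sketch that checks the two default locations mentioned above. If a different path is configured under Startup and Recovery, add it to the list.

```powershell
# Default dump locations; add the configured path here if it differs.
$candidates = @(
    (Join-Path $env:SystemRoot 'MEMORY.DMP'),
    (Join-Path $env:SystemDrive 'MEMORY.DMP')
)

# Missing files are skipped silently; existing ones are listed with size and timestamp.
$dumps = Get-Item -Path $candidates -ErrorAction SilentlyContinue |
    Select-Object -Property FullName, LastWriteTime,
        @{ Name = 'SizeGB'; Expression = { [math]::Round($_.Length / 1GB, 2) } }

$dumps | Format-Table -AutoSize
```

A LastWriteTime that predates the crash under investigation suggests the file is left over from an earlier BSOD.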
Open a new case with support requesting that the BSOD dump be reviewed for root cause.
The dump file will likely be too large to upload to the case. In the case notes, in addition to listing the logs and dump that were collected, request an FTP link from support so that the log and dump files can be uploaded.
Additional information to retrieve:
1. Brand and version of any antivirus or malware detection products installed on the client machine.
2. Installed Microsoft updates (see the step below to get the list).
To list the installed updates, open an elevated PowerShell prompt locally on the machine and type:
get-wmiobject -class win32_quickfixengineering > Updatelist.txt
The file will be created in the directory where the command was run.
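For the antivirus item, client editions of Windows register installed antivirus products with the Security Center, and the registered products can be listed from PowerShell. This is a sketch under two assumptions: the root/SecurityCenter2 namespace is available (it generally is not on Windows Server), and the product has registered itself there. The class reports product names and paths but not versions, so the version still has to be read from the product itself. The output file name is a hypothetical example.

```powershell
# Assumption: client edition of Windows; this namespace is generally absent on Windows Server.
$avProducts = Get-CimInstance -Namespace 'root/SecurityCenter2' -ClassName AntiVirusProduct -ErrorAction SilentlyContinue |
    Select-Object -Property displayName, pathToSignedProductExe

# Hypothetical output file, written to the current directory.
$avProducts | Format-List | Out-File -FilePath .\AntivirusList.txt
```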