Steve Schlarman

Security Operations Management: Metrics that Matter

Blog post by Steve Schlarman, Feb 13, 2014

In my previous blog series on Vulnerability Risk Management, I included a post on “Metrics that Matter”. There I noted that in security we constantly talk about the challenge of showing return on investment. Security Operations is one of the areas where a return can be hard to show. If you have prevented an attack through an efficient identification, escalation and quarantine process, how can you estimate the damage that was avoided? In other words, if you spent $X on creating a streamlined, efficient detection and response process, how can you balance that against the $Y of losses that you prevented? In this case, you may never know what losses were prevented. However, there are certain tangible metrics that can give insight into how well the security operations strategy is playing out.


Metrics that Matter

Accurate Asset Inventory – I explained this in terms of Vulnerability Risk Management, and the same holds true for Security Operations: the priority and proper handling of security events hinges on a clear understanding of the assets. Security Operations must have insight into the business value of assets to stay ahead of the curve. Security often depends entirely on the weakest link, and the weakest link isn’t always the host holding the crown jewels, but the closer the ninjas come to accessing critical business assets, the more urgent the incident handling becomes. If Security Operations understands the infrastructure in terms of the business it services, it can be game changing. The key metric is the percentage of incidents that can be associated with a true, cataloged business asset. Tracking this metric over time gives a clear indicator of how well Security Operations has the insight it needs to prioritize security events and protect the most valued business assets.
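As a rough illustration of the asset metric, the sketch below computes the percentage of incidents that map to a cataloged asset. The Incident record, its asset_id field and the sample data are hypothetical; a real SOC would pull these from its own incident and asset systems.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Incident:
    incident_id: str
    # Hypothetical field: None when no asset could be identified.
    asset_id: Optional[str]


def asset_coverage(incidents, cataloged_assets):
    """Return the percentage of incidents tied to an asset in the catalog."""
    if not incidents:
        return 0.0
    matched = sum(1 for i in incidents if i.asset_id in cataloged_assets)
    return 100.0 * matched / len(incidents)


# Made-up sample data for illustration only.
incidents = [
    Incident("INC-1", "db-01"),
    Incident("INC-2", "laptop-77"),  # asset exists but is not in the catalog
    Incident("INC-3", None),         # no asset identified at all
    Incident("INC-4", "web-02"),
]
catalog = {"db-01", "web-02"}

coverage = asset_coverage(incidents, catalog)
print(f"{coverage:.1f}% of incidents map to a cataloged asset")
```

Recomputing this figure per month or per quarter would show whether the asset inventory is keeping pace with what the analysts actually see.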


Incident Throughput/Workload – When a security event triggers a response, the time it takes to escalate that alert from the generating system to eyes on glass is critical. After that, how fast the event is understood, prioritized and resolved can make the difference between a minor security incident and a significant breach. In addition, the frontline analysts, those eyes on glass, must have the time and bandwidth to do more than fight fires. This breaks down into a few key metrics: duration from time of alert to escalation/identification, time to resolution, number of incidents per analyst, and overall analyst workload (time spent on individual incidents). These metrics give insight into the overall throughput of your security operations cycle. Each stage of an incident (first analysis, second analysis, resolution time, etc.) must be tracked over time to identify the rate at which security events move through the process and are addressed.
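The throughput metrics above can be sketched in a few lines. This is a minimal example under stated assumptions: the record layout (analyst, alerted, escalated, resolved) and the timestamps are invented, and handling time is approximated as the span from escalation to resolution.

```python
from collections import Counter, defaultdict
from datetime import datetime
from statistics import mean

# Hypothetical incident records; field names are assumptions for this sketch.
incidents = [
    {"analyst": "alice", "alerted": datetime(2014, 2, 10, 9, 0),
     "escalated": datetime(2014, 2, 10, 9, 12), "resolved": datetime(2014, 2, 10, 11, 0)},
    {"analyst": "alice", "alerted": datetime(2014, 2, 10, 13, 0),
     "escalated": datetime(2014, 2, 10, 13, 30), "resolved": datetime(2014, 2, 10, 14, 0)},
    {"analyst": "bob", "alerted": datetime(2014, 2, 11, 8, 0),
     "escalated": datetime(2014, 2, 11, 8, 18), "resolved": datetime(2014, 2, 11, 12, 0)},
]


def minutes(start, end):
    """Elapsed time between two timestamps, in minutes."""
    return (end - start).total_seconds() / 60


# Alert-to-escalation and alert-to-resolution durations, averaged.
mean_time_to_escalate = mean(minutes(i["alerted"], i["escalated"]) for i in incidents)
mean_time_to_resolve = mean(minutes(i["alerted"], i["resolved"]) for i in incidents)

# Number of incidents handled by each analyst.
incidents_per_analyst = Counter(i["analyst"] for i in incidents)

# Workload: minutes each analyst spent between escalation and resolution
# (an assumed proxy for time spent on individual incidents).
handling_minutes = defaultdict(float)
for i in incidents:
    handling_minutes[i["analyst"]] += minutes(i["escalated"], i["resolved"])

print(mean_time_to_escalate, mean_time_to_resolve)
print(dict(incidents_per_analyst), dict(handling_minutes))
```

Averages hide outliers, so a real report might also track the median or the slowest incidents at each stage.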


Remediation Time – Security Operations will not own the remediation or resolution of every security alert. At times, certain actions will need to be transitioned to external teams. Once an incident calls for some external action (for example, a host that must be re-imaged after a virus infection, or a configuration change or patch that is needed), the time it takes for that action to come to closure is an indicator of how efficient the adjacent processes are. Security Operations should understand these metrics to ensure it is handing off the information other groups need to close possible vulnerabilities. This metric should measure the length of time from identification to remediation. It reflects both the efficiency of the handoff to external processes and the true time it takes to reduce security risk.
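One way to look at remediation time is per external team, so that slow handoffs stand out. The sketch below assumes hypothetical handoff records with a team name, an identification date and a remediation date (None while the item is still open), and reports the median number of days to remediation.

```python
from collections import defaultdict
from datetime import date
from statistics import median

# Hypothetical handoff records; team names and dates are made up.
handoffs = [
    {"team": "desktop", "identified": date(2014, 1, 6), "remediated": date(2014, 1, 8)},
    {"team": "desktop", "identified": date(2014, 1, 10), "remediated": date(2014, 1, 14)},
    {"team": "patching", "identified": date(2014, 1, 7), "remediated": date(2014, 1, 21)},
    {"team": "patching", "identified": date(2014, 1, 15), "remediated": None},  # still open
]


def remediation_days_by_team(records):
    """Median days from identification to remediation, per team (closed items only)."""
    days = defaultdict(list)
    for r in records:
        if r["remediated"] is not None:
            days[r["team"]].append((r["remediated"] - r["identified"]).days)
    return {team: median(values) for team, values in days.items()}


def open_items_by_team(records):
    """Count handoffs that have not been remediated yet, per team."""
    counts = defaultdict(int)
    for r in records:
        if r["remediated"] is None:
            counts[r["team"]] += 1
    return dict(counts)


median_days = remediation_days_by_team(handoffs)
still_open = open_items_by_team(handoffs)
print(median_days, still_open)
```

Reporting open items alongside the median matters here, because unresolved handoffs would otherwise drop out of the metric entirely.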


Control Efficacy – I dedicated my last blog post to highlighting the role Security Operations can play in providing tangible evidence of control effectiveness. This metric should not be undervalued. A Security Operations team’s role in tracking and measuring controls that work, and those that DON’T, is critical to really understanding where the organization is succeeding and failing in reducing security risk. A regular readout, backed by a constant, living catalog of security controls and analysis, gives material insight into the effectiveness of the overall security controls program.


SOC Program Management – Finally, Security Operations should be a constantly evolving and improving discipline within the organization. Measuring key operational metrics is a way to identify areas of improvement for the overall SOC program. Examples include the percentage of threat categories that have documented triage procedures, the percentage of SOC personnel who are maintaining competency through training and continuing education, and the consistency of shift handover and communication.


The more formal and disciplined the Security Operations function becomes, the more metrics can, and should, be tracked and measured. Metrics greatly improve the understanding of the efficiency and effectiveness of the threat detection and response process within the organization. As with all metrics, it takes a commitment to measure consistently over time to produce meaningful conclusions and to understand where the systemic issues and possible areas of improvement are.


As with all metrics programs, the main question to ask yourself is “Can I measure, track and report on these metrics today?” If so, then you most likely have a pretty solid process in place and can provide management with the basics in terms of progress, efficiency and effectiveness. If not, then the question becomes how you can put in place the right infrastructure to start measuring Metrics that Matter.


To find out how your Security Operations Management team can measure these metrics, research our new module or contact your RSA representative.