In my previous blogs in this series (Vulnerability Risk Management: Let's Not Boil the Ocean and Vulnerability Risk Management: It Is a Big Deal), I focused on how important Vulnerability Risk Management is for organizations and the need to take it beyond a compliance task. When you take that next step and use vulnerability identification and remediation as a core piece of your threat prevention strategy, key metrics must be put in place to measure success. On large infrastructures, vulnerabilities can be identified by the thousands, and without a clear sense of progress, the security and IT functions can quickly weary of playing “whack-a-mole”. In security, we constantly talk about the challenges of showing return on investment. Many times, we use the phrase ‘you can’t measure things that DIDN’T happen’ to justify not having a clear metric for success – meaning that security protected against something bad happening and therefore the investment was justified. However, there are things that can be measured to show success in vulnerability risk management.
Metrics that Matter
Issues by Status – When a vulnerability is identified on a system for the first time, it is a new data point that should inform and, depending on the situation, drive an action. When that vulnerability is found a second, third, fourth time…it indicates a gap. Either the vulnerability is so insignificant that there is no reason to take the time to fix it, or it has been lost in the “vulnerability pit” and no one is paying attention. Either way, some action (exception, escalation, tuning of the scanner, etc.) should be driven by that constant ping of a vulnerability. The status of each vulnerability – whether it is New, Active, Reopened, Verified, Excepted, Pending Remediation or Fixed – must be tracked over time to identify the rate at which vulnerabilities move through the process and are addressed.
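As a minimal sketch of what tracking issues by status might look like, the snippet below tallies scanner findings by lifecycle status. The record layout and field names (`id`, `host`, `status`) are illustrative assumptions, not the export format of any particular scanner.

```python
from collections import Counter

# Hypothetical scanner findings; field names are illustrative only.
# Status values mirror the lifecycle described above.
findings = [
    {"id": "CVE-2023-0001", "host": "web01", "status": "New"},
    {"id": "CVE-2023-0001", "host": "web02", "status": "Active"},
    {"id": "CVE-2022-1234", "host": "db01", "status": "Reopened"},
    {"id": "CVE-2021-4444", "host": "web01", "status": "Fixed"},
]

def issues_by_status(findings):
    """Count how many findings sit in each lifecycle status."""
    return Counter(f["status"] for f in findings)

counts = issues_by_status(findings)
```

Snapshotting these counts on each scan cycle, rather than once, is what turns them into a trend you can act on.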
Remediation Time – Once a vulnerability is identified and flagged to be addressed (through a configuration change, patch, etc.), the time it sits idle (and still a danger to the system) directly impacts the “dwell time” during which an attacker can exploit that vulnerability and use the system to leapfrog to other systems. This metric measures the length of time from identification to remediation and reflects the efficiency of the patch and remediation cycle.
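Computing remediation time reduces to simple date arithmetic once you have identification and remediation dates per finding. This is a sketch under the assumption that those two dates are available for each closed finding; the sample dates are made up.

```python
from datetime import date

# Hypothetical (identified, remediated) date pairs for closed findings.
closed = [
    (date(2024, 1, 2), date(2024, 1, 12)),  # 10 days open
    (date(2024, 1, 5), date(2024, 2, 4)),   # 30 days open
]

def mean_remediation_days(records):
    """Average number of days from identification to remediation."""
    durations = [(fixed - found).days for found, fixed in records]
    return sum(durations) / len(durations)
```

In practice you would likely also track the median and the worst cases, since a few long-lived criticals can hide behind a healthy average.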
Scanner Coverage – Due to the constant change in vulnerabilities, organizations must look to increase their scan footprint (more devices, more often). Scanning only critical systems on a frequent basis can help protect those assets (and is a definite minimum capability), but vulnerable, less critical systems can provide a significant launching point, and given enough time an attacker can breach sensitive systems from them. Organizations should measure the number of devices scanned (as a percentage of the “Device Universe”) and the frequency of scans, as well as the mix of authenticated and unauthenticated scans. Understanding scan coverage is essential to get a picture of the breadth and depth of the vulnerability scanning program.
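The coverage figure itself is just the overlap between scanned devices and the known device universe. A minimal sketch, assuming both are available as sets of hostnames (the names here are invented):

```python
def scan_coverage_pct(scanned, universe):
    """Percent of the known device universe reached by scans."""
    # Intersect first so devices outside the universe don't inflate
    # the number.
    return 100.0 * len(scanned & universe) / len(universe)

# Hypothetical inventory and scan results.
universe = {"web01", "web02", "db01", "mail01"}
scanned = {"web01", "web02", "db01"}
coverage = scan_coverage_pct(scanned, universe)
```

The harder part, of course, is keeping the “universe” set honest, which is exactly what the asset inventory metric below-the-fold is about.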
Inadequate Fixes – Finally, the security function needs to understand the effectiveness of the patch and configuration management processes. This is a critical ‘checks and balances’ relationship between Security and IT. It does not help if vulnerabilities keep popping back up due to bad patching efforts, system administrators ‘undoing’ configuration changes, or other issues. When a vulnerability goes from “Fixed” to “Reopened”, there is an issue somewhere in the process. The patch or fix could be detrimental to the business function, which should drive some risk analysis and an exception process. Regardless of the reason, the issue needs to be investigated and resolved.
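One way to quantify inadequate fixes is the share of findings that were ever marked Fixed and later reopened. This sketch assumes you retain each finding's ordered status history; the sample histories are hypothetical.

```python
def was_reopened(statuses):
    """True if a finding moved from Fixed back to Reopened."""
    if "Fixed" not in statuses:
        return False
    return "Reopened" in statuses[statuses.index("Fixed") + 1:]

def reopen_rate_pct(histories):
    """Percent of ever-Fixed findings whose fix did not hold."""
    fixed = [h for h in histories if "Fixed" in h]
    if not fixed:
        return 0.0
    return 100.0 * sum(was_reopened(h) for h in fixed) / len(fixed)

# Hypothetical status histories, oldest status first.
histories = [
    ["New", "Active", "Fixed", "Reopened"],  # the fix did not hold
    ["New", "Fixed"],
    ["New", "Active"],                       # never fixed; excluded
]
rate = reopen_rate_pct(histories)
```

A rising reopen rate points at the patching or change-control process rather than at the scanner.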
Accurate Asset Inventory – Vulnerability Risk Management hinges on a clear understanding of the assets. If an organization can organize a view of the infrastructure and deal with DHCP, multi-homed devices, overlapping IP spaces, and the other challenges associated with a large device universe, it can be game changing. The key metric here is the percentage of vulnerabilities that can be associated with a true, cataloged business asset. The nameless, faceless, unknown IPs, and the vulnerabilities associated with them, represent a lot of noise. Some of that data is extremely important – the new production servers that were brought online without Security's knowledge, for instance. So tracking this metric over time gives a clear indicator of how well Security and IT understand the business context of IT assets.
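That attribution percentage can be sketched as a lookup of each finding's IP against the asset inventory. The inventory structure, IPs, and hostnames below are all invented for illustration.

```python
# Hypothetical asset inventory keyed by IP address.
inventory = {"10.0.0.5": "web01", "10.0.0.9": "db01"}

# Hypothetical findings; the last IP is not in the inventory.
findings = [
    {"ip": "10.0.0.5", "cve": "CVE-2023-0001"},
    {"ip": "10.0.0.9", "cve": "CVE-2022-1234"},
    {"ip": "10.0.3.7", "cve": "CVE-2021-4444"},  # unknown IP: noise
]

def attribution_pct(findings, inventory):
    """Percent of findings tied to a cataloged business asset."""
    known = sum(1 for f in findings if f["ip"] in inventory)
    return 100.0 * known / len(findings)

pct = attribution_pct(findings, inventory)
```

The unattributed remainder is worth a report of its own: it is where both the noise and the surprise production servers live.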
There are other metrics as well, but getting these basics down can greatly improve the understanding of the efficiency and effectiveness of the vulnerability management program. As with all metrics, it takes a commitment to measure consistently over time to produce meaningful conclusions. Watching these metrics over time can help you understand where the systemic issues in the vulnerability program might be. Is the remediation cycle taking too long? That can indicate an issue with the patching infrastructure. Does a vulnerability sit idle in the same status too long? IT may need more prioritization assistance. Is scanner coverage stagnant, or are systems not tested on a frequent enough cycle? Then the scanning implementation may need a review.
The main question to ask yourself is: “Can I measure, track and report on these metrics today?” If so, then you most likely have a solid process in place and can provide management with the basics in terms of progress, efficiency and effectiveness. If not, then the question is: how can you put in place the right infrastructure to start measuring Metrics that Matter?