Tools and troubleshooting: calculating the index utilization
Security Analytics relies heavily on indexed data when running a query, to ensure high performance and flexibility in both investigations and reports. Whatever condition is set in the where clause is run against the index database to look for a match, which saves the solution from digging into the raw data multiple times and gives near-instant access to your security events.
To make this mechanism practical, the platform has to be told how many unique values each meta key is expected to hold until the index is next saved (every 8 hours by default). This setting is stored alongside the definition of each meta key in the index-concentrator and index-archiver XML files, in the ValueMax flag, which defines the maximum number of unique values a key can take within each time slice. When the buffer is full, new data are prevented from entering the database, hence the importance of carefully monitoring index utilization to ensure that no relevant keys are approaching ValueMax.
To help with this, I’m attaching a simple script which generates a profile for each indexed key, based on the configured ValueMax and the number of unique values currently used by that key, and returns the percentage of buffer utilization in descending order. This helps to identify immediately the meta keys which may require a bigger buffer, or which are storing too many unique values when they shouldn’t.
To set up the script, a Perl interpreter and wget are required, so it can be run on a Security Analytics appliance or on any box with Perl and wget installed. To configure it, open the script and set the variables at the top: the IP of the concentrator and an admin username and password. When run, the script connects to the concentrator’s REST API, pulls out the required data and generates the profile, which is printed to the user’s screen.
Sample output:
service: 10.67% (8/75)
action: 1.50% (15/1000)
ip.proto: 1.17% (3/256)
medium: 1.00% (1/100)
did: 0.39% (1/256)
content: 0.05% (23/50000)
country.dst: 0.05% (5/10000)
city.dst: 0.04% (20/50000)
extension: 0.04% (21/50000)
error: 0.03% (14/50000)
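For readers who want to see how the figures in the sample output are derived, here is a minimal Python sketch of the calculation only. It is not the attached Perl script and it does not talk to the REST API: the function names and the input dictionary are hypothetical, and the counts are assumed to have been collected already.

```python
def utilization_profile(counts):
    """Return (key, percent, used, value_max) rows, highest utilization first.

    `counts` is a hypothetical mapping of meta key name to a tuple
    (unique_values_used, value_max), gathered by some other means.
    """
    rows = []
    for key, (used, value_max) in counts.items():
        percent = 100.0 * used / value_max
        rows.append((key, percent, used, value_max))
    # Sort by percentage so the keys closest to ValueMax come first.
    rows.sort(key=lambda row: row[1], reverse=True)
    return rows


def format_profile(rows):
    """Render rows in the same 'key: pct% (used/max)' shape as the sample output."""
    return [f"{key}: {pct:.2f}% ({used}/{value_max})"
            for key, pct, used, value_max in rows]


if __name__ == "__main__":
    # Three entries taken from the sample output above.
    sample = {"action": (15, 1000), "service": (8, 75), "ip.proto": (3, 256)}
    for line in format_profile(utilization_profile(sample)):
        print(line)
```

Keys near the top of this list are the ones to review first, either by raising their ValueMax or by checking why they hold so many unique values.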
Disclaimer: please DO NOT consider what is described and attached to this post as RSA official content. As with any other unofficial material, it has to be tested in a controlled environment first, and its impact has to be evaluated carefully before it is promoted to production. Also note that the content I publish is usually intended to prove a concept rather than to be fully working in every environment. As such, handle it with care.
Davide - thanks for sharing this useful script! Since the script runs at a specific point in time, it seems most logical to run it just before the 8 hour index slice starts over (so as to run it when the metrics would be at their highest). Do you know what exact times those are configured for, or is there a way we can check? You mentioned every 8 hours but I don't want to assume that they start at 00:00 each day.
Thanks in advance
Hi, I think this is a very good point. The index is saved by the scheduler, which is configured through the scheduler.ini file. However, I'm not sure whether it works from the system clock (which I doubt) or whether it starts counting when the service starts. If it is the latter, as I guess, it would be harder to sync the script with the scheduler.
Thanks Davide for the script. I also noted that the re-indexing event that occurs via the scheduler.ini file will be logged in /var/log/messages on the concentrator with an entry similar to the following:
Apr 7 11:44:43 serverName nw: [Scheduler] [info] Running task index with message save () - 28800 secs waited
Note the seconds at the end indicate that it is doing this re-index after waiting 28800 seconds (8 hours). The log entry is saved in UTC, so there is a bit of conversion you have to do. Our script indicates hours and seconds until the next re-index, and hours and seconds since the last re-index.
Again, we appreciate it! Thanks!
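As an illustration of the log-based approach described above, here is a small Python sketch that reads a scheduler line in the format quoted, extracts the timestamp and the wait time, and estimates the next save. Several things are assumptions: the function name is hypothetical, the year has to be supplied because the syslog timestamp has none, the timestamp is treated as UTC as stated above, and the next save is assumed to come after the same wait as the last one.

```python
import re
from datetime import datetime, timedelta, timezone

# Matches lines shaped like the example quoted above; other variants may exist.
LOG_RE = re.compile(
    r"^(\w{3})\s+(\d{1,2})\s+(\d{2}:\d{2}:\d{2})\s+\S+\s+nw:\s+\[Scheduler\]"
    r".*Running task index with message save.*?(\d+) secs waited"
)


def estimate_next_save(line, year):
    """Return (last_save, estimated_next_save) as UTC datetimes, or None.

    `year` must be passed in because syslog timestamps omit it.
    """
    match = LOG_RE.search(line)
    if match is None:
        return None
    month, day, clock, waited = match.groups()
    last_save = datetime.strptime(
        f"{year} {month} {day} {clock}", "%Y %b %d %H:%M:%S"
    ).replace(tzinfo=timezone.utc)
    # Assumption: the scheduler waits the same number of seconds again.
    return last_save, last_save + timedelta(seconds=int(waited))


if __name__ == "__main__":
    line = ("Apr 7 11:44:43 serverName nw: [Scheduler] [info] "
            "Running task index with message save () - 28800 secs waited")
    last_save, next_save = estimate_next_save(line, 2015)  # year is illustrative
    print("last save:", last_save.isoformat())
    print("estimated next save:", next_save.isoformat())
```

Running the utilization script shortly before the estimated next save should capture the counts close to their peak for that slice.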
I generally look at the time stamps in the index directory and work from that.
Note that if you change the index slice time interval for performance, you'll need to re-evaluate these numbers.
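The timestamp approach mentioned above could be sketched in Python roughly as follows. The function name is hypothetical, the index directory path is deliberately left to the caller since it is not given in this thread, and only files directly inside that directory are considered.

```python
import os
from datetime import datetime, timezone


def newest_file_time(directory):
    """Return the most recent modification time (UTC) among the files
    directly inside `directory`, or None if it contains no files."""
    newest = None
    for entry in os.scandir(directory):
        if entry.is_file():
            mtime = entry.stat().st_mtime
            if newest is None or mtime > newest:
                newest = mtime
    if newest is None:
        return None
    return datetime.fromtimestamp(newest, tz=timezone.utc)
```

Adding the configured save interval to the returned time gives a rough estimate of the next save, on the assumption that the newest file was written by the last save.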