This topic provides an overview of the Security Analytics Core database. The Security Analytics Core services contain a proprietary database developed specifically for use within the Security Analytics suite of products. It bears little resemblance to traditional relational databases, and is not based on any off-the-shelf database technology. As a result, many users face a steep learning curve in understanding how the Core database works and how to make the best use of it. The purpose of this guide is to help Security Analytics users understand the database and use it to its fullest potential.
As a System Administrator, you can use this information to help plan your Security Analytics deployment, and to tune it for best performance. As an Analyst, you can use this guide to structure your analysis in ways that will return reports faster. As a Content Developer, you can use this guide to help write content that will be processed efficiently by the database system.
Security Analytics Products Covered by this Guide
This guide covers the capabilities of Security Analytics 10.6. The following Security Analytics components contain the Core database:
- Log Decoder
Definitions for terms that are used throughout this document are presented here. The terms are listed in the order in which they enter the Security Analytics system:
- Packet DB: The packet database contains the raw captured data. On a Decoder, the packet database contains packets as captured from the network. Log Decoders use the packet database to store raw logs. The raw data stored in the packet database is accessible by a Packet ID; however, this ID is typically not visible to the end user.
- Packet ID: A number used to uniquely identify a packet or log in a packet database.
- Meta DB: The meta database contains items of information that are extracted by a Decoder or Log Decoder from the raw data stream. Parsers, rules, or feeds can generate meta items.
- Meta ID: A number used to uniquely identify a meta item in the meta database.
- Meta Key: A name used to classify the type of each meta item. Common meta keys include ip.src, time, or service.
- Meta Value: Each meta item contains a value. The value is what each parser, feed, or rule generates.
- Session DB: The session database contains information that ties the packet and meta items together into sessions.
- Session: On a packet Decoder, a session represents a single logical network stream. For example, a TCP/IP connection is one session. On a Log Decoder, each log event is one session. Each session contains references to all the Packet IDs and Meta IDs that belong to the session.
- Session ID: A number used to uniquely identify sessions in the Session DB.
- Index: The index is a collection of files that provides a way to look up Session IDs using Meta Values.
- Core Database: The combination of the Packet DB, Meta DB, Session DB, and Index.
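The relationships among these terms can be illustrated with a small in-memory model. This is only a sketch for building intuition: the class `ToyCoreDatabase`, its method names, the sequential ID scheme, and the sample meta values are all assumptions made for illustration, and they do not reflect the actual on-disk format or API of the Core database.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Session:
    """Ties Packet IDs and Meta IDs together (hypothetical layout)."""
    session_id: int
    packet_ids: list = field(default_factory=list)
    meta_ids: list = field(default_factory=list)


class ToyCoreDatabase:
    """Illustrative model only; not the real Core database implementation."""

    def __init__(self):
        self.packet_db = {}    # Packet ID -> raw packet or raw log bytes
        self.meta_db = {}      # Meta ID -> (meta key, meta value)
        self.session_db = {}   # Session ID -> Session
        # Index: (meta key, meta value) -> set of Session IDs
        self.index = defaultdict(set)

    def add_session(self, raw_items, meta_items):
        """Store raw data and meta items, then link them in one session."""
        session = Session(session_id=len(self.session_db) + 1)
        for raw in raw_items:
            packet_id = len(self.packet_db) + 1
            self.packet_db[packet_id] = raw
            session.packet_ids.append(packet_id)
        for key, value in meta_items:
            meta_id = len(self.meta_db) + 1
            self.meta_db[meta_id] = (key, value)
            session.meta_ids.append(meta_id)
            self.index[(key, value)].add(session.session_id)
        self.session_db[session.session_id] = session
        return session.session_id

    def lookup(self, key, value):
        """Use the index to find Session IDs by meta key and meta value."""
        return sorted(self.index.get((key, value), set()))


db = ToyCoreDatabase()
# A log event is one session holding a single raw log.
db.add_session([b"raw log line"], [("ip.src", "10.0.0.1"), ("service", "80")])
# A network stream is one session holding several packets.
db.add_session([b"packet 1", b"packet 2"], [("ip.src", "10.0.0.1")])
print(db.lookup("ip.src", "10.0.0.1"))
```

In this model, a lookup never scans the packet or meta data directly. It goes through the index from a meta value to Session IDs, and each session then leads to its Packet IDs and Meta IDs, which mirrors the roles described in the definitions above.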
Syntax definitions in this document are written in EBNF (Extended Backus-Naur Form) grammar notation.
Security Analytics Core Database History
NetWitness developed the Security Analytics Core database for use in packet capture systems. Early in the history of NetWitness, developers determined that existing database technologies could not keep up with the high ingest rate inherent in full packet capture. Database technologies of the time could not come close to recording the number of sessions received every second, much less sorting every packet. Likewise, the volume of data meant that packet storage would need to be discarded and reused just as quickly as it was consumed, which was also a weakness of databases at the time. Thus, NetWitness created a database consisting of the packet, session, and meta databases.
In order to provide the analytical capabilities of NetWitness Investigator, a meta index was added to the NetWitness database. The index shared the same design goals as the original databases. It was designed to sustain a very high insert rate into a high number of very large indices.
The index has evolved considerably over the years. Early versions of the index were only capable of providing summary estimates of how many unique meta values were present in the meta database. Other versions struggled to deliver acceptable query performance. For example, in NetWitness 9.0, report times were more often measured in minutes than in seconds. The current version of the index is derived from the NetWitness 9.0 index, but has changed substantially in order to meet performance expectations and to add new features.
Core Database Strengths and Weaknesses
The Core database has the following strengths:
- High sustained insert rates, without needing down time for bulk inserts.
- Acceptable query performance while sustaining high insert rates.
- Automatic cleanup and rollover of old data with minimal fragmentation.
- Extremely high number of meta value indices: more than 100 enabled by default on a Concentrator.
- Ability to scale to Petabyte database sizes and Terabyte index sizes within a single node.
- Flexible storage of arbitrary meta items within a session through meta key-value pairs. Thus, a session can be used to represent nearly any kind of data record.
The Core database has the following weaknesses:
- The query functionality is limited and low level.
- The packet, meta, and session DB schema is fixed, and all customization is done through custom meta keys and values.
- The database provides no transaction atomicity guarantees of the kind you might expect to find in a SQL database.