This topic lists and describes the available configuration parameters for RSA NetWitness Suite Log Decoders.
Log Decoder Configuration Settings
This table lists and describes the Log Decoder configuration settings.
|Log Decoder Setting Field||Description|
|Database||/database/config. Refer to the Database Configuration Nodes topic in the NetWitness Suite Core Database Tuning Guide.|
|Decoder||/decoder/config. Refer to Decoder and Log Decoder Configuration Parameters.|
|Index||/index/config. Refer to the Index Configuration Nodes topic in the NetWitness Suite Core Database Tuning Guide.|
|Logs||/logs/config. Refer to Core Service Logging Configuration.|
|REST||/rest/config. Refer to REST Interface Configuration.|
|SDK||/sdk/config. Refer to the SDK Configuration Nodes topic in the NetWitness Suite Core Database Tuning Guide and to Core Service system.role Modes.|
|System||/sys/config. Refer to Core Service System Configuration.|
Log Tokenizer Configuration Settings
The Log Decoder has a set of configuration items that control how the automatic log tokenizer creates meta items from unparsed logs. The log tokenizer is implemented as a set of built-in parsers, each of which scans for a subset of recognizable tokens. The table below describes what each of these native parsers does. The 'word' meta items that the tokenizer produces form a full-text index when they are fed to the indexing engine on the Concentrator and Archiver. You can control which Log Tokenizers are enabled by editing the parsers.disabled configuration entry.
|Parser Name||Description||Configuration Parameters|
|Log Tokens||Scans for runs of consecutive characters to produce 'word' meta items.||token.device.types, token.char.classes, token.max.length, token.min.length, token.unicode|
|IPSCAN||Scans for text that appears to be an IPv4 address to produce 'ip.addr' meta items.||token.device.types|
|IPV6SCAN||Scans for text that appears to be an IPv6 address to produce 'ipv6' meta items.||token.device.types|
|URLSCAN||Scans for text that appears to be a URI to produce 'alias.host', 'filename', 'username', and 'password' meta items.||token.device.types|
|DOMAINSCAN||Scans for text that appears to be a domain name to produce 'alias.host', 'tld', 'cctld', and 'sld' meta items.||token.device.types|
|EMAILSCAN||Scans for text that appears to be an email address to produce 'email' and 'username' meta items.||token.device.types|
|SYSLOGTIMESTAMPSCAN||Scans for text that appears to be a syslog-format timestamp. Syslog timestamps do not include the year or the time zone. When such text is located, it is normalized to UTC to create 'event.time' meta items.||token.device.types|
|INTERNETTIMESTAMPSCAN||Scans for text that appears to be RFC 3339-format timestamps to create 'event.time' meta items.||token.device.types|
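The SYSLOGTIMESTAMPSCAN row above notes that syslog timestamps carry no year or time zone, so both must come from somewhere else before the value can be expressed in UTC. The following Python sketch illustrates that normalization step only. It is not the Log Decoder's implementation: the function name and the `assumed_year` and `assumed_utc_offset_hours` parameters are hypothetical, and this topic does not state how the Log Decoder chooses those values.

```python
from datetime import datetime, timedelta, timezone


def normalize_syslog_timestamp(text, assumed_year, assumed_utc_offset_hours=0):
    """Convert a syslog-style timestamp such as 'Oct 11 22:14:15' to UTC.

    The year and UTC offset are supplied by the caller because the
    syslog format does not contain them (both are assumptions here).
    """
    # Prepend the assumed year so the parsed value is a complete date.
    local = datetime.strptime(f"{assumed_year} {text}", "%Y %b %d %H:%M:%S")
    # Attach the assumed offset, then convert to UTC.
    source_zone = timezone(timedelta(hours=assumed_utc_offset_hours))
    return local.replace(tzinfo=source_zone).astimezone(timezone.utc)


if __name__ == "__main__":
    utc_time = normalize_syslog_timestamp("Oct 11 22:14:15", 2023, -5)
    print(utc_time.isoformat())
```

The month abbreviation is parsed with `%b`, which depends on the process locale, so the sketch assumes an English or C locale.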
These are the Log Tokenizer configuration parameters.
|Log Decoder Parser Setting Field||Description|
|token.device.types||The set of device types that will be scanned for raw text tokens. By default, this is set to unknown, which means only logs that were not parsed will be scanned for raw text. You can add additional log types here to enrich parsed logs with text token information. If this field is empty, log tokenization is disabled.|
|token.char.classes||This field controls the type of tokens that are generated. It can be any combination of the values alpha, digit, space, and punct. The default value is alpha.|
|token.max.length||This field limits the length of the tokens. The default value is 5 characters. The maximum length setting allows the Log Decoder to limit the space needed to store the 'word' meta items. Using longer tokens requires more meta database space, but may provide slightly faster raw text searches. Using shorter tokens forces the text query resolver to perform more reads from the raw logs during searches, but uses much less space in the meta database and index.|
|token.min.length||This is the minimum length of a searchable text token. The minimum token length corresponds to the minimum number of characters a user must type into the search box to locate results. The recommended value is the default, 3.|
|token.unicode||This Boolean setting controls whether Unicode classification rules are applied when classifying characters according to the token.char.classes setting. If this is set to true, each log is treated as a sequence of UTF-8 encoded code points, and classification is performed after UTF-8 decoding. If this is set to false, each log is treated as ASCII characters and only ASCII character classification is done. Unicode character classification requires more CPU resources on the Log Decoder. If you do not need non-English text indexing, you can disable this setting to reduce CPU utilization on the Log Decoder. The default is enabled.|
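To see how the token.char.classes, token.min.length, token.max.length, and token.unicode settings could interact, consider the following Python sketch. It is an illustrative model under stated assumptions, not the Log Decoder's actual algorithm: the `tokenize` function and its parameter names are hypothetical, and the sketch assumes that runs shorter than the minimum are dropped and that longer runs are truncated to the maximum length, which this topic does not specify.

```python
import unicodedata

# Hypothetical mapping from token.char.classes values to character tests.
CLASS_TESTS = {
    "alpha": str.isalpha,
    "digit": str.isdigit,
    "space": str.isspace,
    # Unicode general categories P* (punctuation) and S* (symbols).
    "punct": lambda ch: unicodedata.category(ch)[0] in "PS",
}


def tokenize(log, char_classes=("alpha",), min_length=3, max_length=5,
             unicode=True):
    """Split a log line into 'word'-style tokens (illustrative only)."""
    tests = [CLASS_TESTS[name] for name in char_classes]

    def in_class(ch):
        # With unicode=False, non-ASCII characters never match any class.
        if not unicode and not ch.isascii():
            return False
        return any(test(ch) for test in tests)

    tokens, run = [], []

    def flush():
        # Assumption: drop short runs, truncate long ones.
        if len(run) >= min_length:
            tokens.append("".join(run)[:max_length])
        run.clear()

    for ch in log:
        if in_class(ch):
            run.append(ch)
        else:
            flush()
    flush()  # Emit the final run, if any.
    return tokens


if __name__ == "__main__":
    print(tokenize("Failed login for administrator from 10.1.2.3"))
    print(tokenize("na\u00efve caf\u00e9", unicode=False))
```

With the defaults from the table (alpha, minimum 3, maximum 5), the numeric address in the first sample line yields no tokens, and with `unicode=False` the accented characters in the second sample split the surrounding words into shorter runs.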