Here is a quick guide to help users identify parsing issues within their SA instance. Because SA parsers rely on static text matching, new or rarely seen events may not be identified properly, so you may want to schedule reports or queries that catch them.
Detecting unidentified events within your SA instance
Description : This identifies events for which either A) your decoders do not have the relevant parser enabled, or B) SA does not support this log type.
Query : device.type=unknown
Detecting properly identified events where no tagging exists
Description : This identifies events that are caught by a parser but have no tagging/normalization. Either the log was matched by the wrong parser (header parser issues), and/or the matching parser has no message parser for this event.
Query : device.type exists && msg.id !exists
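To make the difference between the two queries concrete, here is a minimal Python sketch that applies the same logic to events outside of SA. It assumes each event has been exported as a dictionary of meta keys; the export format and the function names classify_event and summarize are hypothetical and not part of SA.

```python
from collections import Counter


def classify_event(meta):
    """Classify one event's meta (a dict) using the logic of the two queries."""
    device_type = meta.get("device.type")
    if device_type is None:
        # No device.type at all: neither query matches this event.
        return "other"
    if device_type == "unknown":
        # Mirrors query 1: device.type=unknown
        return "unidentified"
    if "msg.id" not in meta:
        # Mirrors query 2: device.type exists && msg.id !exists
        # (unknown events are checked first so they are not counted twice)
        return "identified_no_tagging"
    return "parsed"


def summarize(events):
    """Count how many events fall into each parsing status."""
    return Counter(classify_event(meta) for meta in events)


if __name__ == "__main__":
    # Illustrative sample events, not real SA output.
    sample = [
        {"device.type": "unknown"},
        {"device.type": "rhlinux"},
        {"device.type": "rhlinux", "msg.id": "sshd_01"},
    ]
    for status, count in summarize(sample).items():
        print(status, count)
```

A summary like this can be useful for comparing the share of unidentified and untagged events from one reporting period to the next.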
Tip1 :
Run these reports over a period of time, and make sure the parsers on your decoders are cleaned up (remove unused parsers). Unused parsers add processing overhead and may cause logs to be identified by the wrong parser.
Tip2 :
You will also want to keep an eye on device types that are outliers, meaning a parser (device.type) that is generating an extremely high or low amount of traffic compared to the others. Chances are that you will want to dig into these logs and ensure that the proper parser is being used. Create daily/weekly reports and/or leverage rules within real-time charts on dashboards to keep an eye on your log sources.
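One possible way to spot such outliers from a report export is sketched below in Python. It assumes you already have event counts per device.type from a daily or weekly report; the function name find_outliers, the factor of 10, and the sample counts are all illustrative assumptions, and comparing against the median is just one simple heuristic.

```python
from statistics import median


def find_outliers(counts, factor=10.0):
    """Flag device types whose event count is far above or below the median.

    counts: dict mapping device.type -> event count for the report period.
    factor: how many times above/below the median counts as an outlier
            (an arbitrary starting point, tune it for your environment).
    """
    if not counts:
        return {}
    mid = median(counts.values())
    outliers = {}
    for device_type, count in counts.items():
        if count > mid * factor:
            outliers[device_type] = "high"
        elif count < mid / factor:
            outliers[device_type] = "low"
    return outliers


if __name__ == "__main__":
    # Hypothetical counts per device.type, not taken from a real instance.
    weekly_counts = {
        "rhlinux": 500000,
        "ciscoasa": 12000,
        "winevent_nic": 9000,
        "apache": 15000,
        "checkpointfw1": 40,
    }
    for device_type, direction in find_outliers(weekly_counts).items():
        print(device_type, direction)
```

Any device type flagged this way is only a candidate for review; the raw logs still need to be checked to confirm whether the wrong parser is in use.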
Tip3 :
Rhlinux is usually a catch-all: it identifies traffic from many logs for which it has no message parser (use the second query above, device.type exists && msg.id !exists, to catch this). I have also seen some Cisco parsers catch logs that are not properly identified.
Users should also keep an eye on the meta tag "parse.error", which shows problems the parser had in normalizing the event.