Sorry if this is a stupid question, but I'm going to ask anyway! Today we ran into an issue where an ODBC trace file got corrupted. We didn't know about it until I happened to be poking around /var/log/messages for another issue and saw this bad boy:
Feb 1 23:38:04 HYBCOLL-1 NwLogCollector: [OdbcCollection] [failure] [mssql.DBNAME2017] [processing] [DBNAME2017] [processing] Data query failed; dataQuery: exec nic_aud_swap_trace 30, 'K:\TraceFiles\blahblahFile', 1, 'WHERE StartTime > 2017-02-01 10:37:42.080', exception Unable to execute statement: Statement: "exec nic_aud_swap_trace 30, 'K:\TraceFiles\blahblahFile', 1, 'WHERE StartTime > 2017-02-01 10:37:42.080'"; Reason: state: S1000; error-code: 139964394242615; description: [RSA][ODBC SQL Server Wire Protocol driver][Microsoft SQL Server]File 'K:\TraceFiles\blahblahblah-24.trc' either does not exist or is not a recognizable trace file. Or there was an error opening the file.
I was like, shit, that's not good. We actually depend on audit reports from ODBC trace files, so we fixed it following:
Anyway, the question to everyone is: would it be possible to alert on this type of error from Health & Wellness? Does Health & Wellness let us alert on a keyword or string from, say, /var/log/messages? I'm just thinking about system-level monitoring options and wondering what we can do out of the box versus not. I'm also curious whether people in the community are using external monitoring tools (like Nagios) to watch over the NW mother ship.
Our NW is at 10.6.1.
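To make the "external monitoring tool" option concrete, here is a rough sketch of the kind of check I have in mind. It is not a Health & Wellness feature. It is a standalone Python script that scans /var/log/messages for the `NwLogCollector: [OdbcCollection] [failure]` string from the log line above and exits with the usual Nagios plugin codes (0 OK, 2 CRITICAL, 3 UNKNOWN). The function names and the choice to match on that exact string are my own assumptions.

```python
import re
import sys

# Assumption: match the prefix seen in the failing log line quoted in this post.
FAILURE_PATTERN = re.compile(r"NwLogCollector: \[OdbcCollection\] \[failure\]")
LOG_PATH = "/var/log/messages"


def find_odbc_failures(lines):
    """Return the log lines that report an ODBC collection failure."""
    return [line.rstrip("\n") for line in lines if FAILURE_PATTERN.search(line)]


def main(path=LOG_PATH):
    """Scan the log and return a Nagios-style exit code."""
    try:
        with open(path, errors="replace") as handle:
            failures = find_odbc_failures(handle)
    except OSError as exc:
        print(f"UNKNOWN - cannot read {path}: {exc}")
        return 3
    if failures:
        print(f"CRITICAL - {len(failures)} ODBC collection failure(s); last: {failures[-1]}")
        return 2
    print("OK - no ODBC collection failures found")
    return 0


if __name__ == "__main__":
    sys.exit(main())
```

This rescans the whole file on every run, so it would stay CRITICAL until the log rotates. A real check would probably need to remember the last position it read.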