I've created the attached script, which I believe may be useful in certain circumstances. Most of what it accomplishes can also be done with the Reporting and Alerting capabilities in SA/NextGen.
However, by writing CSV to standard output it can easily be used to integrate with other tools in an automated fashion if necessary. It also has a key feature (in my view) missing from the reporting engine: it can count, and sort by count, reports with more than one key (for example, the top source-destination IP pairs).
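The multi-key counting described above boils down to aggregating on a composite key. Here is a minimal sketch of the idea (the sample rows and variable names are illustrative, not taken from the script itself):

```python
from collections import Counter

# Hypothetical sample rows, e.g. (ip.src, ip.dst) meta pairs from a query result
rows = [
    ("10.0.0.1", "203.0.113.5"),
    ("10.0.0.1", "203.0.113.5"),
    ("10.0.0.2", "203.0.113.9"),
    ("10.0.0.1", "198.51.100.7"),
]

# Counting on a composite (multi-key) tuple gives per-pair totals,
# which a single-key report cannot express.
pair_counts = Counter(rows)

# Sort by count, descending, to get the "top pairs" view
for (src, dst), n in pair_counts.most_common():
    print("%s,%s,%d" % (src, dst, n))
```

The same pattern extends to three or more keys simply by widening the tuple.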
It requires Python 2.7.x and has been tested on *nix and Cygwin. It displays the following help/usage message when executed with -h or without parameters.
Usage: nwsdk_csv.py [options]
This script will connect via the NW REST API. It will use the 'keys' and
'where' parameters to build the necessary call. It will output the results to
STDOUT as a CSV file.
-h, --help show this help message and exit
-c CONNECT, --connect=CONNECT
[REQUIRED] NextGen REST device URL (e.g:
-w CLAUSE, --where=CLAUSE
Query's 'where' clause
-t TIME, --time=TIME Time window (in seconds from now(), if not used time
defaults to 'All time')
-k FIELDS, --keys=FIELDS
Meta Keys to extract
-u USERNAME, --user=USERNAME
Username for REST endpoint
-p PASSWORD, --pass=PASSWORD
Password for REST endpoint
--no-count Do not display aggregation count
--no-header Do not add header line to output
--dns Resolve IP addresses via DNS
--top=TOP Filter on only Top <TOP> values for first key
-f FILENAME, --file=FILENAME
Filename tracking the latest completed METAID
--gmtime Convert Time from Epoch to GMT
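As an aside, the reverse lookups behind a --dns style option can be sketched as follows (a minimal illustration under my own assumptions, not the script's actual code; `resolve` is a hypothetical helper name):

```python
import socket

def resolve(ip):
    # Best-effort reverse DNS lookup; fall back to the raw IP on any
    # resolution failure so the CSV output always has a value.
    try:
        return socket.gethostbyaddr(ip)[0]
    except (socket.herror, socket.gaierror, OSError):
        return ip
```

Falling back to the original address on failure matters in practice, since many internal or spoofed addresses have no PTR record.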
Here are some basic examples of what it can be used for.
Show a count of all communications involving the top 3 source addresses in the last 15 minutes, including source IP address, destination IP address, service and the respective count.
# python ./nwsdk_csv.py -c https://broker:50103/ -t 900 --top 3
2013-Aug-29 11:07:02 - INFO: Using SSL, applying TLSv1 fix
2013-Aug-29 11:07:02 - INFO: Getting top 3 values for ip.src
2013-Aug-29 11:07:02 https://broker:50103/sdk?msg=values&size=3&fieldName=ip.src&flags=2305&where=service%20exists%20%26%26%20time%3D%222013-Aug-29%2014%3A52%3A02%22-%222013-Aug-29%2015%3A07%3A02%22&expiry=0&force-content-type=application/json
2013-Aug-29 11:07:02 https://broker:50103/sdk?msg=query&size=250000&id1=0&query=select%20ip.src%2Cservice%2Cip.dst%20where%20service%20exists%20%26%26%20time%3D%222013-Aug-29%2014%3A52%3A02%22-%222013-Aug-29%2015%3A07%3A02%22%20%26%26%20ip.src%3D192.168.134.136%2C192.168.134.134%2C192.168.134.133&flags=1&expiry=0&force-content-type=application/json
2013-Aug-29 11:07:11 More data to process Completed ID:5763124851 Meta ID:5763124861 Last ID:5763125005
2013-Aug-29 11:07:12 All done Completed ID:5763124861 Meta ID:5763124861 Last ID:5763125005
In the example above, the script first queries for the top 3 source IP addresses and then uses them in the following request as part of the 'where' clause.
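That two-phase flow can be sketched as follows (the function name and base clause are illustrative assumptions, not the script's internals; compare the generated clause with the `where=` parameter in the logged URLs above):

```python
# Phase 1 returns the top N values for the first key; phase 2 folds
# them into the main query's where clause.

def build_where(base_clause, key, top_values):
    # e.g. "service exists" becomes "service exists && ip.src=a,b,c"
    clause = base_clause
    if top_values:
        clause += " && %s=%s" % (key, ",".join(top_values))
    return clause

top_ips = ["192.168.134.136", "192.168.134.134", "192.168.134.133"]
where = build_where("service exists", "ip.src", top_ips)
print(where)
# service exists && ip.src=192.168.134.136,192.168.134.134,192.168.134.133
```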
The following example shows a recent query used to extract information on all the relevant sessions.
# python ./nwsdk_csv.py -c https://broker:50103/ -k "sessionid,time,ip.src,ip.dst,service,tcp.dstport,alias.host,client,server,directory,filename,risk.info" -w "alias.host begins update,report && filename='<none>' && directory='/' && query exists && query length 100-u" --gmtime
2013-Aug-30 09:33:38 - INFO: Using SSL, applying TLSv1 fix
2013-Aug-30 09:33:38 https://broker:50103/sdk?msg=query&size=250000&id1=0&query=select%20sessionid%2Ctime%2Cip.src%2Cip.dst%2Cservice%2Ctcp.dstport%2Calias.host%2Cclient%2Cserver%2Cdirectory%2Cfilename%2Crisk.info%20where%20alias.host%20begins%20update%2Creport%20%26%26%20filename%3D%27%3Cnone%3E%27%20%26%26%20directory%3D%27/%27%20%26%26%20query%20exists%20%26%26%20query%20length%20100-u&flags=1&expiry=0&force-content-type=application/json
2013-Aug-30 09:33:43 All done Completed ID:2147212752 Meta ID:2147212752 Last ID:5842214536
44918863,"2013-May-05 20:47:21 GMT",192.168.14.14,188.8.131.52,80,80,updateserver.zillya.com,VPNGuardService,nginx,"/",<none>,"flags_syn|flags_rst|flags_psh|flags_ack|http1.1 without referer header|nginx http server",1
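As a side note on --gmtime: the epoch-to-GMT rendering seen in the 'time' column above can be reproduced with Python's time module (a minimal sketch; `to_gmt` is an illustrative name, not the script's actual function):

```python
import time

def to_gmt(epoch_secs):
    # Matches the "2013-May-05 20:47:21 GMT" style seen in the CSV output
    return time.strftime("%Y-%b-%d %H:%M:%S GMT", time.gmtime(epoch_secs))

print(to_gmt(1367786841))  # 2013-May-05 20:47:21 GMT
```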
Here are a few more interesting examples; unfortunately, for these I can't provide the output.
./nwsdk_csv.py -c https://broker:50103/ -k 'sessionid,time,ip.src,ip.dst,alias.host,service,tcp.dstport,udp.dstport,ip.proto,client,alert,risk.info,risk.warning,risk.suspicious' -w 'alert = my_ioc && alert != ips_of_interest' --no-count --gmtime -f track_new_myioc.lastid > track_new_myioc.output_`date +"%Y-%m-%d_%H%M"`.csv
The example above extracts several meta keys based on alert criteria from existing content on NextGen Decoders and saves the output to a date-coded file name. It also records the last position in the NWDB it queried up to in the file "track_new_myioc.lastid", and it will use that same file in subsequent runs (as long as it is passed as a parameter) to fetch only new data from that position forward. This is similar to the process used by the Alerting Engine.
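The checkpointing idea can be sketched like this (illustrative helper names and values, not the script's internals; the saved ID would feed the query's id1 parameter on the next run):

```python
import os

def read_last_id(path):
    # Missing or empty file means "start from the beginning"
    if os.path.exists(path):
        with open(path) as f:
            return int(f.read().strip() or 0)
    return 0

def write_last_id(path, meta_id):
    with open(path, "w") as f:
        f.write(str(meta_id))

state = "track_new_myioc.lastid"
id1 = read_last_id(state)         # resume point for the next query
# ... run the query starting at id1 ...
write_last_id(state, 5842214536)  # persist the highest meta ID seen
```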
# for d in `cat bad_domains`; do f="alias_host_$d.csv" ; d="'$d'"; python nwsdk_csv.py -c https://broker:50103/ -k alias.host,ip.src,ip.dst,service -w "alias.host contains $d" > $f; done
The example above iterates through a list of domains contained in the "bad_domains" file and produces an output file per domain with the source IP, destination IP and service for each. This can be handy for data that was collected before certain feeds were deployed.
I'm sure you will find several other uses, and please share them if you do! Please feel free to provide any feedback or ask any questions.
PS: I don't claim any programming skills, so feel free to re-use or modify this code.
PPS: It's been brought to my attention that if you upgrade to SA 10.6.2.2, which uses TLS 1.2, running this script with the native OS Python 2.6.6 will no longer work. Please reach out to me directly if you really need a "hack" around it.