Rui Ataide

REST API to CSV

Discussion created by Rui Ataide Employee on Aug 30, 2013
Latest reply on Nov 24, 2016 by Rui Ataide

Hi all,

 

I've created the attached script, which I believe might be useful in certain circumstances. Mostly what it accomplishes can also be done by the Reporting and Alerting capabilities in SA/NextGen.

 

However, by generating CSV to the Standard Output it can easily be use to integrate with other tools in an automated form if necessary. It also has a key feature (in my view) missing in the reporting engine, it will be able to count and sort by count reports with more than one key, for example the top source-destination IP pairs.

 

It requires Python 2.7.x and has been tested on *nix and CygWin. It will display the following help/usage message when executed with -h or without parameters.

Usage: nwsdk_csv.py [options]

 

 

This script will connect via the NW REST API. It will use the 'keys' and

'where' parameters to build the necessary call. It will output the results to

STDOUT as a CSV file.

 

 

Options:

  -h, --help            show this help message and exit

-c CONNECT, --connect=CONNECT

                        [REQUIRED] NextGen REST device URL (e.g:

                        http://nwbroker:50103/ or https://nwcon:50105/)

  -w CLAUSE, --where=CLAUSE

                        Query's 'where' clause

  -t TIME, --time=TIME  Time window (in seconds from now(), if not used time

                        defaults to 'All time')

  -k FIELDS, --keys=FIELDS

                        Meta Keys to extract

  -u USERNAME, --user=USERNAME

                        Username for REST endpoint

  -p PASSWORD, --pass=PASSWORD

                        Username for REST endpoint

  --no-count            Do not display aggregation count

  --no-header           Do not add header line to output

  --dns                 Resolve IP addresses via DNS

  --top=TOP             Filter on only Top <TOP> values for first key

  -f FILENAME, --file=FILENAME

                        Filename tracking the latest completed METAID

  --gmtime              Convert Time from Epoch to GMT

 

 

Here are some basic examples of what it can be used for.

 

Show a count of all communications between the top 3 source addresses in the last 15 minutes, including Source IP address, Destination IP address, Service and the respective count.

# python ./nwsdk_csv.py -c https://broker:50103/ -t 900 --top 3

2013-Aug-29 11:07:02 - INFO: Using SSL, applying TLSv1 fix

2013-Aug-29 11:07:02 - INFO: Getting top 3 values for ip.src

2013-Aug-29 11:07:02 https://broker:50103/sdk?msg=values&size=3&fieldName=ip.src&flags=2305&where=service%20exists%20%26%26%20time%3D%222013-Aug-29%2014%3A52%3A02%22-%222013-Aug-29%2015%3A07%3A02%22&expiry=0&force-content-type=application/json

2013-Aug-29 11:07:02 https://broker:50103/sdk?msg=query&size=250000&id1=0&query=select%20ip.src%2Cservice%2Cip.dst%20where%20service%20exists%20%26%26%20time%3D%222013-Aug-29%2014%3A52%3A02%22-%222013-Aug-29%2015%3A07%3A02%22%20%26%26%20ip.src%3D192.168.134.136%2C192.168.134.134%2C192.168.134.133&flags=1&expiry=0&force-content-type=application/json

2013-Aug-29 11:07:11 More data to process Completed ID:5763124851 Meta ID:5763124861 Last ID:5763125005

2013-Aug-29 11:07:12 All done Completed ID:5763124861 Meta ID:5763124861 Last ID:5763125005

ip.src,service,ip.dst,count

192.168.14.14,0,192.168.14.26,2872

192.168.14.13,0,192.168.14.27,2851

192.168.14.16,0,192.168.14.26,2699

192.168.14.14,0,192.168.14.14,1826

192.168.14.13,0,192.168.14.14,1820

192.168.14.16,0,192.168.14.14,1771

192.168.14.16,53,192.168.14.4,1112

192.168.14.14,80,192.168.14.11,112

192.168.14.13,80,192.168.14.11,110

192.168.14.16,80,192.168.14.11,103

192.168.14.16,0,192.168.14.11,35

192.168.14.14,0,192.168.14.11,15

192.168.14.13,0,192.168.14.11,12

192.168.14.16,0,192.168.14.4,2

 

In the above example, the script will first query for the top 3 Source IP addresses and will then use this on the following request as part of the 'where' clause.

 

The following example, shows one of The specified item was not found.'s recent queries used to extract information on all the relevant sessions.

 

# python ./nwsdk_csv.py -c https://broker:50103/ -k "sessionid,time,ip.src,ip.dst,service,tcp.dstport,alias.host,client,server,directory,filename,risk.info" -w "alias.host begins update,report && filename='<none>' && directory='/' && query exists && query length 100-u" --gmtime

2013-Aug-30 09:33:38 - INFO: Using SSL, applying TLSv1 fix

2013-Aug-30 09:33:38 https://broker:50103/sdk?msg=query&size=250000&id1=0&query=select%20sessionid%2Ctime%2Cip.src%2Cip.dst%2Cservice%2Ctcp.dstport%2Calias.host%2Cclient%2Cserver%2Cdirectory%2Cfilename%2Crisk.info%20where%20alias.host%20begins%20update%2Creport%20%26%26%20filename%3D%27%3Cnone%3E%27%20%26%26%20directory%3D%27/%27%20%26%26%20query%20exists%20%26%26%20query%20length%20100-u&flags=1&expiry=0&force-content-type=application/json

2013-Aug-30 09:33:43 All done Completed ID:2147212752 Meta ID:2147212752 Last ID:5842214536

sessionid,time,ip.src,ip.dst,service,tcp.dstport,alias.host,client,server,directory,filename,risk.info,count

44918863,"2013-May-05 20:47:21 GMT",192.168.14.14,213.133.99.140,80,80,updateserver.zillya.com,VPNGuardService,nginx,"/",<none>,"flags_syn|flags_rst|flags_psh|flags_ack|http1.1 without referer header|nginx http server",1

 

Here's a few more interesting examples, unfortunately for these I can't provide the output.

./nwsdk_csv.py -c https://broker:50103/ -k 'sessionid,time,ip.src,ip.dst,alias.host,service,tcp.dstport,udp.dstport,ip.proto,client,alert,risk.info,risk.warning,risk.suspicious' -w 'alert = my_ioc && alert != ips_of_interest' --no-count --gmtime -f track_new_myioc.lastid > track_new_myioc.output_`date +"%Y-%m-%d_%H%M"`.csv

 

The above example, will extract several meta keys based on alert criteria that is part of existing content on NextGen Decoders, it will save that output to a date coded file name but it will keep track of the last position on the NWDB that it queried up to in the file "track_new_myioc.lastid", it will use this same file in subsequent runs (as long as passed as a parameter) to only get new data from that position forward, this is a similar process to that used by the Alerting Engine.

 

# for d in `cat bad_domains`; do f="alias_host_$d.csv" ; d="'$d'"; python nwsdk_csv.py -c https://broker:50103/ -k alias.host,ip.src,ip.dst,service -w "alias.host contains $d" > $f; done

 

The above example will iterate through a list of domains contained on the "bad_domains" file and will provide an output per domain of the IP source, IP destination and Service for each. This can be handy for data that was collected before certain feeds where deployed.

 

I'm sure you will find several other examples and please shared them if you do! Please feel free to provide any feedback or ask any questions.

 

Thank you,

 

Rui

PS: I don't claim any programmer skills so feel free to re-use or modify this code.

PPS: It's been brought to my attention that if you upgrade to SA 10.6.2.2 due to its use of TLS 1.2 running this script with the native OS python 2.6.6 version will no longer work. Please reach out to me directly if you really need a "hack" around it.

Attachments

Outcomes