NetWitness NextGen RESTful Python Test App
As promised at the NetWitness User Conference, here is the Python program I wrote to demonstrate how to query NextGen devices over REST. This script was written in Python 3.2.3, but we will be posting a backport of it for Python 2.7. The backport was done by Dave Penner.
I hope everyone enjoys it. Post any questions or comments in this thread and feel free to enhance it and post it back to the community.
NextGen Lead Engineer
Let me give you a little background on what the tool is showing you in case this is your first time playing with NextGen. If you connect to a concentrator you should see a number of items in the "Pathname" drop list. You can think of everything as a directory structure under "/" on the given service. If you choose "/" as the path and "ls" as the "Message" then you get back the nodes (or directories) that are contained under root. So if the path is "/concentrator/config" you get all the nodes under that path.
Most of the folders have both a "stats" and a "config" folder. The stats are real-time information updated as the system is running, and the config sets up system behavior. To find out a folder's children, send the "ls" message to the node ("Submit" button). To find out what messages a node supports, send the "help" message to the node. To find out details about a specific message, send the "help" message and add the parameter "msg=ls" (or click the "Get Message Help" button). These basic concepts will allow you to completely explore the node tree. To get started navigating the summary below, send the "ls" message to the "/" path. For now, make sure your "Accept" option is "text/plain" (I'll talk more about this later). My examples will be in URL format, showing what the request would look like in a browser. The tool breaks the parts of the URL down into understandable concepts, but once you understand how to navigate using the tool, a browser is just as easy to use.
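To make the URL format concrete, here is a minimal Python sketch that builds the kind of URLs described above. The helper name node_url and the host "myconcentrator" are made up for illustration, and the sketch only builds the request without sending it, so authentication and SSL are left out.

```python
from urllib.parse import quote, urlencode
from urllib.request import Request


def node_url(base, path, msg, **params):
    """Build a URL that sends `msg` (plus optional parameters) to a node path."""
    query = {"msg": msg}
    query.update(params)
    # quote_via=quote encodes spaces as %20 instead of '+'.
    return base.rstrip("/") + path + "?" + urlencode(query, quote_via=quote)


# Hypothetical concentrator address; substitute your own host and REST port.
BASE = "http://myconcentrator:50105"

ls_root = node_url(BASE, "/", "ls")
ls_config = node_url(BASE, "/concentrator/config", "ls")
help_sdk = node_url(BASE, "/sdk", "help")

# Mirror the tool's "Accept" option. The request is only built here, not sent.
request = Request(ls_root, headers={"Accept": "text/plain"})

print(ls_root)
print(ls_config)
print(help_sdk)
```

Pasting any of the printed URLs into a browser should be equivalent to filling in the "Pathname" and "Message" fields in the tool and pressing "Submit".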
The following summarizes what each node does.
concentrator - This contains core concentrator information for the aggregation process. The stats show you available resources during aggregation as well as current aggregation rates. The configuration specifies how aggregation operates for memory and performance as well as controlling the devices that the concentrator is connecting to. Sending messages to this node will allow you to start/stop aggregation (/concentrator?msg=start).
connections - This contains information about the clients that are connecting to the concentrator. You will rarely if ever need information under this node.
database - This contains stats and configuration about the storage of data on the concentrator. The stats show information about session and meta stored in the databases as well as the write rates. The configuration specifies storage location, rollover constraints, and other storage settings.
index - This contains stats and configuration about the indexing process. The stats show information about the data ranges being used by the index as well as page and size information. The configuration specifies the storage location as well as the page compression algorithm to use. Note, while the stats are interesting this is NOT the place to go to query data. This node should be used for troubleshooting purposes only.
logs - This contains stats and configuration for the logging system. The stats contain the number of logs being stored. The configuration specifies the log level to store as well as an SNMP agent to send traps to. Sending messages to this node allows you to download log messages (/logs?msg=download&id1=100&id2=200).
rest - This contains stats and configuration of the REST API service. The configuration specifies the location of the cache directory and the port to run on. The service can also be turned off from here.
services - This contains connections to other services. You will never need to manipulate this folder.
sys - This contains stats and configuration for the service. The stats show you CPU and memory utilization, service status, version, and licensing information. The configuration specifies the port it listens on, SSL usage, and historical statistic storage. Sending messages to this node allows you to manage licenses, save configuration, or shut down the service (/sys?msg=shutdown&reason=Monthly Maintenance).
users - This contains stats and configuration for users, groups, and roles. This section is best left to a user interface but you can poke around to see what is there.
sdk - I intentionally left this node for last since it is the most important for extracting information from the system and thus requires the most explanation. The stats show you information about the active queries being run. The configuration helps control how queries are run. The meat of this node resides in its messages. The following summarizes the important messages for obtaining data over REST.
language - This message will return a list of all the meta data fields that are currently configured in the index. Do not forget the required parameters id1, id2, and size (/sdk?msg=language&id1=0&id2=0&size=100).
aliases - This message will return a set of value aliases that can be presented to a user. It will map a value inside meta for a given key into a user friendly value (/sdk?msg=aliases&key=service).
summary - This message gives you a general overview of the data in the system such as the first and last session and meta. This is a good place to start to get the active id ranges for data extraction.
session - This message returns the meta ID range for a given session ID range. This is important since the query and values messages require a meta range, not a session range. If you use zero for id1 or id2 then the earliest and latest IDs will be returned (/sdk?msg=session&id1=0&id2=0).
values - This message will give you the top N values for a given meta field. You could take the results from the language call and make a values call for each language result to get a summary of each meta key. You can also provide a where clause to narrow the results (/sdk?msg=values&id1=1&id2=100000&size=10&fieldName=service&where=username%3Dtim). A very common procedure is to use the "time" meta key in the where clause to bound the query to the last 24 hours or some other timeframe.
query - This message will give you meta data that match the given query. The query clause is used to control what information is returned. Its syntax is pseudo-SQL (select time,username where service=80). A very common procedure is to use the "time" meta key in the where clause to bound the query to the last 24 hours or some other timeframe.
content - This message was not intended to be used directly with REST. Once you have a handle on the above concepts go to a web browser and use the URL (/sdk/content) to explore how to get packet and log data in a reconstructed fashion. Use the help page to play with the various URL parameters.
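As a rough illustration of the sdk messages above, the following Python sketch builds the session and values URLs from the examples and percent-encodes the where clause. The helper sdk_url and the host "myconcentrator" are hypothetical. The parameter name "query" for the query message is an assumption on my part since it is not spelled out above, so check the node's "help" message for the exact parameters.

```python
from urllib.parse import quote, urlencode


def sdk_url(base, msg, **params):
    """Build a URL for an /sdk message, percent-encoding each parameter value."""
    query = {"msg": msg}
    query.update(params)
    # quote_via=quote turns "username=tim" into "username%3Dtim" and spaces into %20.
    return base.rstrip("/") + "/sdk?" + urlencode(query, quote_via=quote)


# Hypothetical concentrator address; substitute your own host and REST port.
BASE = "http://myconcentrator:50105"

# Zero for id1 and id2 asks for the earliest and latest IDs.
session_url = sdk_url(BASE, "session", id1=0, id2=0)

# Top 10 values of the "service" meta key, narrowed by a where clause.
values_url = sdk_url(BASE, "values", id1=1, id2=100000, size=10,
                     fieldName="service", where="username=tim")

# ASSUMPTION: the query clause is passed in a parameter named "query".
query_url = sdk_url(BASE, "query", query="select time,username where service=80")

for url in (session_url, values_url, query_url):
    print(url)
```

The meta ID range returned by the session call is what you would plug into id1 and id2 of the values call; parsing that response is left out here because it depends on the "Accept" format you choose.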
The tree structure is almost identical for a decoder. The one difference is that the "concentrator" folder has been replaced by a "decoder" folder. I will leave it up to you as homework to check out what things are available under that folder. Also, experiment with the "Accept" drop list to see what other formats the data can be returned in.
A lot of this functionality is available at the root of your appliance's REST website. Open a link like hxxp://<mybroker fqdn>:50103 or hxxp://<myconcentrator>:50105 (for example, http://notarealbroker.netwitness.net:50103/), then click on the asterisk to the right of the "sdk" listing. You can then choose an available method from the SDK drop-down menu, put mandatory or optional parameters in the parameters field, separated by spaces, and submit your query. It will show you the related URI and the output as plain text, unless you override it by setting "force-content-type=" to one of these types: "text/html", "application/json", or "application/octet-stream".
NOTE: If you are using IE, you have to always use "?force-content-type=text/html" to even see this option, as IE tends to default to XML display of REST. Example: http://notarealbroker.netwitness.net:50103/?force-content-type=text/html
You can click the asterisk to see any of the other functionality on indexes, databases, and system properties, too.
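If you are scripting rather than clicking, the force-content-type override described above can be appended to any URL. This is only a sketch: the helper name force_content_type is made up, and it accepts just the three types listed above.

```python
from urllib.parse import quote, urlencode

# Content types listed above as valid overrides.
ALLOWED_TYPES = ("text/html", "application/json", "application/octet-stream")


def force_content_type(url, content_type):
    """Append a force-content-type parameter to a REST URL."""
    if content_type not in ALLOWED_TYPES:
        raise ValueError("unsupported content type: " + content_type)
    # Use '&' when the URL already has a query string, '?' otherwise.
    separator = "&" if "?" in url else "?"
    # safe="/" leaves the slash in the type unescaped, as in the example URL above.
    param = urlencode({"force-content-type": content_type}, safe="/", quote_via=quote)
    return url + separator + param


root_html = force_content_type("http://notarealbroker.netwitness.net:50103/", "text/html")
summary_json = force_content_type(
    "http://notarealbroker.netwitness.net:50103/sdk?msg=summary", "application/json")

print(root_html)
print(summary_json)
```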
The script didn't work over SSL on Python 2.7.
I added the snippet below just after the other imports and it worked.
----------------- snip ---------------------------
import ssl

orig_ssl_wrap = ssl.wrap_socket

def my_ssl_wrap(socket, keyfile=None, certfile=None, server_side=False, cert_reqs=0, ssl_version=2, ca_certs=None, do_handshake_on_connect=True, suppress_ragged_eofs=True, ciphers=None):
    # Force TLSv1 regardless of the protocol version the caller requested.
    ssl_version = ssl.PROTOCOL_TLSv1
    return orig_ssl_wrap(socket, keyfile, certfile, server_side, cert_reqs, ssl_version, ca_certs, do_handshake_on_connect, suppress_ragged_eofs, ciphers)

ssl.wrap_socket = my_ssl_wrap
----------------- snip ---------------------------