AnsweredAssumed Answered

NetWitness Python REST Timeout Issues when selecting certain Metadata?

Question asked by PacketRat on Dec 10, 2012
Latest reply on Nov 20, 2016 by Kenny Kim

Been fighting Gremlins with "408 Request Timeout" when accessing the NetWitness  REST API from Python 2.7.  Recently had an epiphany that explains the variability of the observed behavior, but doesn't make sense.  Hoping for some insight/ideas.

 

The python code first runs a query to gather the SessionIDs for activity matching the Search terms.  Then depending on options selected, will  iterate through the Sessionid list to gather all of the related Metadata, NWD Session files, PCAPs, and/or File Artifacts.

 

An example log:

 

(a) 2012-Dec-10 01:39:11 [main:request 121210000024]  Maximum returned session threshold has been overridden, new threshold: 10000

(a) 2012-Dec-10 01:39:11 [main]  Output format field xml set to: 1

(a) 2012-Dec-10 01:39:11 [main:request 121210000024]  Output format field pcap set to: 0

(a) 2012-Dec-10 01:39:11 [main:request 121210000024]  Output format field nwd set to: 1

(a) 2012-Dec-10 01:39:11 [main:request 121210000024]  Output format field artifacts set to: 1

(a) 2012-Dec-10 01:39:11 [main:request 121210000024]  Searchmeta field set to: alias.host

(a) 2012-Dec-10 01:39:11 [main:request 121210000024]  Searchbool field set to: =

(a) 2012-Dec-10 01:39:11 [main:request 121210000024]  Searchdata field set to: 'badguys.com'

(a) 2012-Dec-10 01:39:11 [main:request 121210000024]  Meta field set to: action,alias.host,alias.ip,ip.src,ip.dst

(a) 2012-Dec-10 01:39:11 [main:request 121210000024]  Start date field set to: 2012-Sep-08

(a) 2012-Dec-10 01:39:11 [main:request 121210000024]  End date field set to: 2012-Dec-09

(a) 2012-Dec-10 01:39:11 [main:request 121210000024]  Start time field set to: 00:00:00

(a) 2012-Dec-10 01:39:11 [main:request 121210000024]  End time field set to: 23:59:59

2012-Dec-10 01:39:11 [main:request 121210000024]  New request started

(F) 2012-Dec-10 01:39:11 [runqueries:request 121210000024]  alias.host%20=%20'badguys.com'Search Result

2012-Dec-10 01:39:11 [sessionsquery:request 121210000024]  Sessionid query started /sdk?msg=query&query=select+sessionid+where+alias.host%20=%20'badguys.com'%26%26time=%272012-Sep-08%2000:00:00%27-%272012-Dec-09%2023:59:59%27&size=1000&id1=0&id2=0

2012-Dec-10 01:39:13 [sessionsquery:request 121210000024]  Sessionid query completed /sdk?msg=query&query=select+sessionid+where+alias.host%20=%20'badguys.com'%26%26time=%272012-Sep-08%2000:00:00%27-%272012-Dec-09%2023:59:59%27&size=1000&id1=0&id2=0

2012-Dec-10 01:39:13 [xmlquery:request 121210000024]  XML metadata selection query started /sdk?msg=query&query=select+sessionid,filetype,extension,filename,action,alias.host,alias.ip,ip.src,ip.dst+where+alias.host%20=%20'badguys.com'%26%26time=%272012-Sep-08%2000:00:00%27-%272012-Dec-09%2023:59:59%27&size=1000&id1=0&id2=0

2012-Dec-10 01:39:16 [xmlquery:request 121210000024]  XML metadata selection query completed

2012-Dec-10 01:39:16 [artifactsquery:request 121210000024]  File artifacts query started

2012-Dec-10 01:39:16 [artifactsquery:request 121210000024]  File artifact query completed

2012-Dec-10 01:39:16 [zipresults:request 121210000024]  Zipping temp folder

2012-Dec-10 01:39:16 [zipresults:request 121210000024]  Finished zipping temp folder

2012-Dec-10 01:39:16 [zipresults:request 121210000024]  Request completed

2012-Dec-10 01:39:16 [zipresults:request 121210000024]  Request processing totaled 5.2389998436 seconds

 

 

While each option presents it's own "Request Timeout " challenges, this post focuses on apparent Metadata field selection sensitivity.


 

Problem Summary

 

408 Request Timeouts  appear randomly, without any obvious relation to:

 

  • Number of Search Terms (1 or a 1,000)
  • Period (1 Day or 120 Days)
  • The number of matching Sessions (setting 'size' = 1000 or 100  eliminates this factor for this experiment)
  • Number of Metadata fields returned

 

The first Eureka was when things started working very reliably after inadvertently removing 'time' from one of the standard Selection Clauses:

 

From:     Select   =  'time,action,alias.host,alias.ip,ip.src,ip.dst'    

To:          Select   =  'action,alias.host,alias.ip,ip.src,ip.dst' 


Initial testing revealed that other metadata selection terms have similar impacts ( and possibly the combination and/or ordering of terms?).

 

Including Session Time in our XML Metadata Extract processes is critical to almost every Use Case, trumped only by reliabilioty.

 

 

Any insights, thoughts, stratgeies, debugging tips, etc. are sincerely appreciated.  Also interested in any general discussions, techniques, code samples, etc   on dealing with NW REST API Session Timeouts in generall, in Use Cases where queries can return very large data sets,  and/or when applying complex search terms to extended periods containing high volumes of traffic (Python 2.7/3.2 context very helpful)

Outcomes