000034250 - Warehouse data retention for Netwitness Logs & Packets

Document created by RSA Customer Support Employee on Nov 2, 2016
Last modified by RSA Customer Support Employee on Apr 21, 2017
Version 3

Article Content

Article Number: 000034250
Applies To: RSA Product Set: Security Analytics
RSA Product/Service Type: SA Warehouse
Issue: This article explains how to set up data retention on the Warehouse node cluster.
Resolution: We recommend using a custom script to remove old avro files, because MapR (like other Hadoop distributions) does not provide a built-in feature for setting a maximum retention period.
The script below can be scheduled on one of the nodes in the cluster to run on a daily or weekly basis:
cat /opt/mapr/server/retention.sh
#!/bin/bash
# Delete avro files under /rsasoc that are older than the given number of days.
usage="Usage: retention.sh [days]"
if [ ! "$1" ]
then
  echo "$usage"
  exit 1
fi
now=$(date +%s)
# Columns of each listing line: $6 = modification date, $8 = file path.
hadoop fs -lsr /rsasoc | grep -E "avro$" | grep -v "meta" | while read -r f; do
  dir_date=$(echo "$f" | awk '{print $6}')
  difference=$(( (now - $(date -d "$dir_date" +%s)) / (24 * 60 * 60) ))
  if [ "$difference" -gt "$1" ]; then
    echo "$f"
    # Remove the file and append the timestamped result to the retention log.
    result=$(hadoop fs -rm "$(echo "$f" | awk '{print $8}')")
    result="$(date) $result"
    echo "$result" >> /opt/mapr/logs/retention.log
  fi
done
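
Before scheduling the script, it can be useful to preview which files a given retention value would select. The following is a minimal dry-run sketch, not part of the original script: the function name list_expired, its optional second argument, and the sample listing lines and paths are all hypothetical and used for illustration only. It applies the same age calculation as the script above to listing lines read from standard input, and only prints the matching paths instead of deleting them.

```shell
# Hypothetical dry-run helper: reads "hadoop fs -lsr" style lines on stdin and
# prints the path of every avro file older than the given number of days.
# $1 = retention in days, $2 = reference time in epoch seconds (default: now)
list_expired() {
  days=$1
  now=${2:-$(date +%s)}
  grep -E "avro$" | grep -v "meta" |
    while read -r perms repl owner grp size fdate ftime path; do
      # Same arithmetic as the retention script: whole days since the file date.
      age=$(( (now - $(date -d "$fdate" +%s)) / 86400 ))
      if [ "$age" -gt "$days" ]; then
        echo "$path"
      fi
    done
}

# Made-up listing lines for illustration only.
sample_listing='-rw-r--r--   3 mapr mapr 1048576 2016-01-01 10:30 /rsasoc/example/old.avro
-rw-r--r--   3 mapr mapr 1048576 2016-06-01 10:30 /rsasoc/example/new.avro'

# Fixed reference time so the example does not depend on the current date.
reference=$(date -d "2016-07-01" +%s)
expired=$(echo "$sample_listing" | list_expired 180 "$reference")
echo "$expired"
```

On a cluster node, the same function could be fed real data with: hadoop fs -lsr /rsasoc | list_expired 180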

For example:
The cron job below runs the data retention cleanup every Thursday at 5:15 PM to remove data older than 180 days. Add it with crontab -e:
15 17 * * 4 sh /opt/mapr/server/retention.sh 180
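
As a rough aid for choosing the schedule and the retention value, the sketch below spells out the crontab fields in comments and prints the calendar date that a 180-day retention corresponds to today. The daily 02:30 schedule and the variable names are assumptions for illustration, and the date arithmetic relies on GNU date.

```shell
# Crontab fields: minute hour day-of-month month day-of-week command
#   15 17 * * 4  -> every Thursday at 17:15 (the entry above)
#   30 2 * * *   -> every day at 02:30 (hypothetical daily schedule)

# Print the cutoff date for a 180-day retention as of today.
retention_days=180
cutoff=$(date -d "$retention_days days ago" +%F)
echo "With $retention_days days of retention, files dated before $cutoff are removed"
```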

If you have any questions about the script above, please contact RSA Netwitness Technical Support.