000034250 - Warehouse data retention for RSA Security Analytics

Document created by RSA Customer Support Employee on Nov 2, 2016
Last modified by RSA Customer Support on Apr 24, 2019
Version 4

Article Content

Article Number: 000034250
Applies To:
RSA Product Set: Security Analytics
RSA Product/Service Type: Security Analytics Warehouse
RSA Version/Condition: 10.6.x
Platform: CentOS
O/S Version: 6
 
Tasks
Use the following steps to set up data retention on the RSA Security Analytics Warehouse node cluster.

Resolution
We recommend using a custom script to remove the Avro files, since neither MapR nor any other Hadoop distribution provides a feature for setting a maximum retention period.

The script below can be scheduled on one of the nodes in the cluster to run daily or weekly:

# cat /opt/mapr/server/retention.sh



#!/bin/bash
# Delete Avro files under /rsasoc that are older than the given number of days.
usage="Usage: retention.sh [days]"
if [ -z "$1" ]
then
  echo "$usage"
  exit 1
fi
now=$(date +%s)
# Listing columns: permissions, replication, owner, group, size, date, time, path
hadoop fs -lsr /rsasoc | grep -E "avro$" | grep -v "meta" | while read -r f; do
  dir_date=$(echo "$f" | awk '{print $6}')
  file_path=$(echo "$f" | awk '{print $8}')
  # Age of the file in whole days
  difference=$(( ( now - $(date -d "$dir_date" +%s) ) / (24 * 60 * 60) ))
  if [ "$difference" -gt "$1" ]; then
    echo "$f"
    # Remove the file and log the result with a timestamp
    result=$(hadoop fs -rm "$file_path")
    echo "$(date) $result" >> /opt/mapr/logs/retention.log
  fi
done
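
The age check used by the script can be tried without touching the cluster. The following is a minimal dry-run sketch, not part of the script above: it assumes GNU date (as shipped with CentOS) and uses UTC so that results are repeatable. The function names age_in_days and expired_path, the sample listing line, and its path are hypothetical and only serve as an illustration.

```shell
#!/bin/bash
# Dry-run sketch: decide which listing lines are past retention, delete nothing.
# The sample line below is hypothetical, not taken from a real cluster.

# Print the age in whole days of a YYYY-MM-DD date relative to an epoch time.
age_in_days() {
  local file_date=$1 now=$2
  echo $(( (now - $(date -u -d "$file_date" +%s)) / 86400 ))
}

# Print the path (column 8) of a listing line if it is older than max_days.
expired_path() {
  local line=$1 max_days=$2 now=$3
  local file_date file_path
  file_date=$(echo "$line" | awk '{print $6}')
  file_path=$(echo "$line" | awk '{print $8}')
  if [ "$(age_in_days "$file_date" "$now")" -gt "$max_days" ]; then
    echo "$file_path"
  fi
}

sample='-rwxr-xr-x   3 mapr mapr   1048576 2016-01-01 12:00 /rsasoc/example/sessions-1.avro'
now=$(date -u -d 2016-07-19 +%s)

# The sample file is 200 days old on 2016-07-19, so it exceeds a 180-day limit.
expired_path "$sample" 180 "$now"
```

Running the sketch with a limit of 180 days prints the sample path; with a limit of 365 days it prints nothing, because the file is still within retention.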


For example, the cron job below runs the data retention cleanup every Thursday at 5:15 pm and removes data older than 180 days:

# crontab -e
15 17 * * 4 sh /opt/mapr/server/retention.sh 180
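
For a daily schedule, only the day-of-week field needs to change. The entry below is a sketch of such a variant; the 5:15 pm time and the 180-day limit are simply carried over from the weekly example and are assumptions, not recommended values.

```
# minute  hour  day-of-month  month  day-of-week  command
15 17 * * * sh /opt/mapr/server/retention.sh 180
```

The five schedule fields are minute, hour, day of month, month, and day of week. An asterisk in the last field runs the job every day, while the 4 in the weekly example restricts it to Thursdays.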


If you have any questions regarding the above script, please contact RSA NetWitness Technical Support.
