|Applies To||RSA Product Set: Security Management|
RSA Product/Service Type: Vulnerability Risk Manager
RSA Version/Condition: 1.1 SP1
Platform: CentOS
|Issue||The following error may appear on the VRM cluster when starting or stopping MapR services:|
"line 93: /opt/mapr/logs/warden.log: No space left on device"
The following alarm may be triggered on the cluster:
Installation Directory Full
/opt/mapr on the node is running out of space (95% full).
The output of the "df -h" command shows 100% utilization of /opt.
|Cause||There is a known defect in VRM versions prior to 1.2 where the /opt/mapr/hadoop/hadoop-0.20.2/pids directory continues to grow until the /opt volume runs out of drive space.|
Additionally, any MapR service that is failing (Status 4), for any reason, fills its logging directory with detailed logs of the failure, and these log files also grow until the /opt volume runs out of drive space.
Finally, if data throughput on the nodes is high, the TaskTracker logs generated during normal operation of the cluster can grow very large. This can fill the /opt partition because MapR log retention is based on age in days rather than on log file size.
|Resolution||To resolve this issue:|
- (As required) Change log.retention.time of the MapR warehouse.
- Follow the workaround below to force maintenance jobs to clean up the folders in this partition so that an upgrade can be completed.
- Upgrade to Vulnerability Risk Management 1.2.
- Run the command: ls -al /opt/mapr/hadoop/hadoop-0.20.2/logs
- Identify whether regular cluster throughput is causing any TaskTracker log file to exceed 70 megabytes per day (70,000,000 bytes). If so, complete "Update log.retention.time" below.
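As a rough aid for the check above, the following sketch lists only the files that exceed a byte threshold instead of scanning the "ls -al" output by eye. The function name list_large_logs is hypothetical and not part of VRM or MapR; the 70,000,000-byte default comes from the step above.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with VRM or MapR).
# Prints every regular file under DIR that is larger than BYTES.
# Usage: list_large_logs DIR [BYTES]
list_large_logs() {
    dir=$1
    limit=${2:-70000000}   # default: 70,000,000 bytes, per the step above
    # "-size +Nc" matches files whose size is greater than N bytes
    find "$dir" -type f -size +"${limit}c" -print
}

# Example call, using the log directory named in the step above:
# list_large_logs /opt/mapr/hadoop/hadoop-0.20.2/logs
```

If the command prints any TaskTracker log files, the node is likely a candidate for the log.retention.time change.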
Update log.retention.time
This procedure is required only if regular MapR throughput generates log files large enough to fill the /opt partition (see above).
Stop MapR services on the cluster node
service mapr-zookeeper stop
service mapr-warden stop
Move to the MapR configuration folder
Copy the existing configuration file to another file named after the current date
cp ./warden.conf ./warden08242016.conf
Edit the warden configuration file
Append a log.retention.time entry (3 days, expressed in milliseconds) to the end of the warden.conf file
Start MapR services on the cluster node
service mapr-zookeeper start
service mapr-warden start
Repeat steps 1-6 for each node in the cluster.
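The value for the entry can be worked out with a small sketch like the one below, which converts a retention period in days to milliseconds. The function name days_to_ms is hypothetical. The key=value line format and the /opt/mapr/conf location shown in the comment are assumptions and should be checked against the existing warden.conf on the node before appending anything.

```shell
#!/bin/sh
# Hypothetical helper: convert a retention period in days to milliseconds.
days_to_ms() {
    # days * 24 hours * 60 minutes * 60 seconds * 1000 milliseconds
    echo $(( $1 * 24 * 60 * 60 * 1000 ))
}

retention_ms=$(days_to_ms 3)
echo "log.retention.time=${retention_ms}"   # prints log.retention.time=259200000

# To append the entry (assumed path and key=value format; verify first):
# echo "log.retention.time=${retention_ms}" >> /opt/mapr/conf/warden.conf
```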
|Workaround||Follow these steps to delete MapR logs older than 4 days and stale process ID (PID) files of MapR services older than 7 days:|
- Log in to a cluster node as the root user
- Confirm low /opt disk space with the "df" command.
- service mapr-zookeeper stop
- service mapr-warden stop
- find /opt/mapr/hadoop/hadoop-0.20.2/pids/* -mtime +7 -print -delete
- find /opt/mapr/hadoop/hadoop-0.20.2/logs/hadoop-mapr* -mtime +4 -print -delete
- find /opt/mapr/logs/* -mtime +4 -print -delete
- find /opt/mapr/hbase/hbase-0.94.13/logs/*log* -mtime +4 -print -delete
- service mapr-zookeeper start
- service mapr-warden start
- (wait 2 minutes)
- maprcli service list -node NODE_NAME
- Verify that no MapR service has a status of 4 before proceeding. Any service with status 4 (FAILED) must be investigated for other problems.
- Repeat steps 1-13 for each cluster node that is low on drive space on the /opt partition.
- maprcli alarm clearall (To clear all old alarms)
Note: NODE_NAME in step 12 above is the name of the server.
Note: If an upgrade to VRM 1.2 is not completed, these commands must be rerun on each cluster node, as needed, to ensure that /opt does not fill up.
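Before running the find commands with -delete, it may help to preview what each one would remove. The sketch below runs the same match without the -delete action. The function name preview_old is hypothetical; the paths in the commented examples are the ones used in the workaround steps.

```shell
#!/bin/sh
# Hypothetical helper: print what "find PATH... -mtime +DAYS -delete"
# would match, without deleting anything.
# Usage: preview_old DAYS PATH...
preview_old() {
    days=$1
    shift
    # "-mtime +N" matches entries last modified more than N whole days ago
    find "$@" -mtime +"$days" -print
}

# Example calls mirroring the workaround steps:
# preview_old 7 /opt/mapr/hadoop/hadoop-0.20.2/pids/*
# preview_old 4 /opt/mapr/hadoop/hadoop-0.20.2/logs/hadoop-mapr*
# preview_old 4 /opt/mapr/logs/*
```

Because find discards fractional days when it computes file age, "-mtime +4" only matches files that are at least 5 full days old, so files slightly older than 4 days are kept.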