|Issue||HBase Master and HBase RegionServer Down Alarms in the RSA Security Analytics Warehouse (SAW) MapR UI.|
The MapR UI shows the following two alarms: (See Figures 1 and 2 below)
HBase Master Down Alarm
Can not determine if service: hbmaster is running. Check logs at: /opt/mapr/hbase/hbase-0.92.2/logs
HBase RegionServer Down Alarm
Can not determine if service: hbregionserver is running. Check logs at: /opt/mapr/hbase/hbase-0.92.2/logs
The HBase Master and HBase RegionServer node services show a failed state in the MapR UI. (See Figure 3 below)
The logs found in the /opt/mapr/hbase/hbase-0.92.2/logs directory on the SAW nodes report errors similar to the following:
2014-06-03 20:44:08,022 ERROR org.apache.zookeeper.client.ZooKeeperSaslClient: An error: (java.security.PrivilegedActionException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]) occurred when evaluating Zookeeper Quorum Member's received SASL token. Zookeeper Client will go to AUTH_FAILED state.
2014-06-03 20:44:08,023 ERROR org.apache.zookeeper.ClientCnxn: SASL authentication with Zookeeper Quorum member failed: javax.security.sasl.SaslException: An error: (java.security.PrivilegedActionException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]) occurred when evaluating Zookeeper Quorum Member's received SASL token. Zookeeper Client will go to AUTH_FAILED state.
2014-06-03 20:44:08,032 WARN org.apache.hadoop.hbase.zookeeper.ZKUtil: regionserver:60020-0x45e281efb65d20 Unable to set watcher on znode /hbase/master
org.apache.zookeeper.KeeperException$AuthFailedException: KeeperErrorCode = AuthFailed for /hbase/master
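To check whether a given node is logging the errors shown above, the log directory can be searched for the two distinctive strings. The following is a minimal sketch, not an RSA or MapR tool: the function name scan_hbase_logs is hypothetical, and the search patterns are simply copied from the sample log lines in this article.

```shell
#!/bin/sh
# Sketch only: list HBase log files that contain the ZooKeeper SASL/Kerberos
# authentication failures quoted in this article.
# scan_hbase_logs is a hypothetical helper name, not a MapR command.
scan_hbase_logs() {
    # Default to the log directory named in the alarm text; pass another
    # directory as the first argument to override it.
    log_dir="${1:-/opt/mapr/hbase/hbase-0.92.2/logs}"
    # -r: search the directory recursively, -l: print matching file names only,
    # -E: extended regular expression so that | means "or".
    grep -rlE 'Failed to find any Kerberos tgt|KeeperErrorCode = AuthFailed' "$log_dir"
}
```

Running scan_hbase_logs with no arguments on a SAW node prints the names of the log files that contain either string. The function returns a non-zero status when no file matches.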
Follow the steps below to identify and resolve the issue. WARNING! This process will result in approximately 15-30 minutes of downtime on the SAW cluster.
- Connect to each SAW node via SSH.
- On each node, issue the command rpm -qa | grep mapr-hbase to identify which node (or nodes) still has HBase packages installed.
- On each node where HBase packages were found, issue the following command to remove the packages: yum remove mapr-hbase*
- After confirming that the HBase packages have been removed from all nodes, perform the steps below on all nodes to reconfigure MapR so that it recognizes that HBase is no longer installed. CAUTION! Perform each instruction on all nodes before moving to the next step.
- On each node, stop the warden service (which may take some time to complete) with the following command: service mapr-warden stop
- Stop the zookeeper service on each node with the following command: service mapr-zookeeper stop
- Execute the configure.sh script to configure MapR on each node by issuing the following command: /opt/mapr/server/configure.sh -R
- Start the zookeeper service on each node with the following command: service mapr-zookeeper start
- Confirm that the Zookeeper Quorum is set up by issuing the command service mapr-zookeeper qstatus on each node. Each node should give a status of "Leader" or "Follower" when running the command.
- On each node, start the warden service with the following command: service mapr-warden start
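The ordering requirement in the steps above (finish each instruction on every node before starting the next one) can be sketched as a small script. This is an illustration under stated assumptions rather than a supported procedure: the node names in NODES are hypothetical placeholders, remote execution over ssh with sufficient privileges is assumed, and the quorum check only looks for the words "Leader" or "Follower" in the qstatus output, as described in the steps. The function names are also hypothetical.

```shell
#!/bin/sh
# Sketch only. NODES is a hypothetical, space-separated list of SAW hostnames.
NODES="${NODES:-saw-node1 saw-node2 saw-node3}"

# Run one command on every node before returning, stopping at the first failure.
# RUN is the remote-execution command; ssh is an assumption.
run_on_all() {
    for node in $NODES; do
        echo "== $node: $*"
        ${RUN:-ssh} "$node" "$@" || return 1
    done
}

# Read qstatus output on stdin; succeed if it reports Leader or Follower.
quorum_ok() {
    grep -qiE 'leader|follower'
}

# Verify the ZooKeeper quorum status on every node.
check_quorum() {
    for node in $NODES; do
        ${RUN:-ssh} "$node" service mapr-zookeeper qstatus | quorum_ok || {
            echo "quorum check failed on $node" >&2
            return 1
        }
    done
}

# Each step completes on all nodes before the next step begins.
reconfigure_cluster() {
    run_on_all service mapr-warden stop &&
    run_on_all service mapr-zookeeper stop &&
    run_on_all /opt/mapr/server/configure.sh -R &&
    run_on_all service mapr-zookeeper start &&
    check_quorum &&
    run_on_all service mapr-warden start
}
```

With NODES set to the actual SAW hostnames, calling reconfigure_cluster walks through the sequence and stops at the first step that fails, so the warden service is not started unless every node reports a quorum status. Running the commands by hand, as described in the steps, remains the reference procedure.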
Completing the steps above reconfigures MapR so that it no longer expects HBase to be present, which stops the alarms from being raised. If the downtime required by this procedure is not acceptable, an alternative is to disable the HBase-related alarms in the MapR UI. To do so, follow the steps below.
- Log into the MapR control system UI.
- Navigate to Alarms -> Alerts using the Navigation tree on the left-hand side of the window.
- Uncheck the HBase and HBase Master-related alarms and apply the changes.
If you are unsure of any of the steps above or experience any issues, contact RSA Support and quote this article ID for further assistance.