000014406 - Appliance fails to join because of slow network switch negotiation

Document created by RSA Customer Support Employee on Jun 14, 2016. Last modified by RSA Customer Support Employee on Apr 21, 2017.
Version 2

Article Content

Article Number: 000014406
Applies To: RSA Data Protection Manager Appliance 3.1.2
Issue: Appliance fails to join because of slow network switch negotiation
The DPM Appliance fails to join another appliance, and the following errors appear in the log file /opt/appliance/logs/rkma-system.log during the join operation:

2012-02-28 11:08:54,323 ERROR - com.rsa.appliance.sys.service.impl.NewSetupApplianceServiceImpl.processErrorsIfAny(NewSetupApplianceServiceImpl.java:462) : Exception occurred: Error on copy from remote box:Copying file /version.txt from 10.10.17.55...
Copy ... Failed.False
Remote server is unreachable.
2012-02-28 11:08:54,323 ERROR - com.rsa.appliance.sys.service.impl.NewSetupApplianceServiceImpl.processErrorsIfAny(NewSetupApplianceServiceImpl.java:483) : Could not connect to the provided remote IP 10.10.17.55 as QUSER.
2012-02-28 11:08:54,324 ERROR - com.rsa.appliance.sys.service.impl.NewSetupApplianceServiceImpl.processErrorsIfAny(NewSetupApplianceServiceImpl.java:486) : Exception occurred: Error while trying to connect to the remote host:Copying file /version.txt from 10.10.17.55...
Copy ... Failed.False
Remote server is unreachable.
2012-02-28 11:08:54,325 ERROR - error.setup.software.configuration.failed
com.rsa.appliance.exception.BusinessServiceException
 at com.rsa.appliance.sys.service.impl.NewSetupApplianceServiceImpl.processErrorsIfAny(NewSetupApplianceServiceImpl.java:493)
 at com.rsa.appliance.sys.service.impl.NewSetupApplianceServiceImpl.validateHostAndPwdAndCopyCertificates(NewSetupApplianceServiceImpl.java:405)
 at com.rsa.appliance.sys.service.impl.NewSetupApplianceServiceImpl.validateClusterJoinReadiness(NewSetupApplianceServiceImpl.java:334)
 at com.rsa.appliance.sys.service.impl.NewSetupApplianceServiceImpl.setupAppliance(NewSetupApplianceServiceImpl.java:162)
 at com.rsa.appliance.sys.scheduler.QuickSetupJob.executeJob(QuickSetupJob.java:66)
 at com.rsa.appliance.sys.taskmanagement.BaseJob.execute(BaseJob.java:167)
 at org.quartz.core.JobRunShell.run(JobRunShell.java:202)
 at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:534)
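
To check whether a failed join matches this article, the log can be searched for the two messages shown above. The following is a minimal sketch and is not part of the product: the function name find_join_errors is hypothetical, and the default log path is the one given in this article.

```shell
#!/bin/bash
# Hypothetical helper (not shipped with the appliance): print the log lines
# that carry the error signature described in this article.
find_join_errors() {
    # Default to the log path named in this article; allow an override.
    local log_file="${1:-/opt/appliance/logs/rkma-system.log}"
    if [ ! -r "$log_file" ]; then
        echo "Cannot read $log_file" >&2
        return 2
    fi
    # grep returns 0 when at least one line matches and 1 when none do.
    grep -E 'Remote server is unreachable|Could not connect to the provided remote IP' "$log_file"
}
```

Calling find_join_errors with no argument reads the default log. A return status of 1 means neither message was found, so the join failure probably has a different cause.
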
Resolution: This issue is fixed in the next release, DPM Appliance 3.2 (not yet released at the time of writing).
As a workaround for DPM Appliance 3.1.2, before starting the join operation, edit the script /opt/rsa/setup/sh/copy_functions.sh and add a "ping -c30" command to the copyDummyFileFromRemoteServer function, as shown between the ### KMA-2623 ### markers in the following excerpt. The 30 ping requests give a slow switch port time to finish negotiation before the file copy is attempted:

function copyDummyFileFromRemoteServer()
{
username=quser
password=$2
remoteServer=$1
file=/version.txt
copy_dir=/opt/rsa/setup/work
        mkdir -p /opt/rsa/setup/work
        rm -f /opt/rsa/setup/work/version.txt
        rm -f /root/.ssh/known_hosts
        ### KMA-2623 ###
        echo "Pinging host $remoteServer for 30 counts to allow slow switch port negotiation to occur"
        ping -c30 $remoteServer
        ################
        echo "Copying file $file from $remoteServer... "
        COPY_STATUS=`python /opt/rsa/setup/py/GetFileFromRemoteServerNew.py $username $password $remoteServer $file $copy_dir`
        retval=$?
        if [ $retval != 0 ]
        then
                echo "Copy ... Failed.$COPY_STATUS"
                return $retval
        else
                if [ ! -f /opt/rsa/setup/work/version.txt ]; then
                        echo "Could not copy the file"
                        return 1
                fi
                echo "Copy Done"
                return 0
        fi
}
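
Before editing a script on the appliance, it is prudent to keep a copy of the original and, afterwards, to confirm that the ping line is in place. The following is a minimal sketch under stated assumptions: the helper names backup_script and has_ping_workaround and the ".orig" suffix are hypothetical and are not part of the product.

```shell
#!/bin/bash
# Hypothetical helpers (not shipped with the appliance).

# Copy the script to <script>.orig before editing, unless a backup already exists.
backup_script() {
    local script="$1"
    [ -f "$script.orig" ] || cp -p "$script" "$script.orig"
}

# Return 0 if the script pings $remoteServer with a count of 30
# (matches both "ping -c30" and "ping -c 30").
has_ping_workaround() {
    local script="$1"
    grep -Eq 'ping[[:space:]]+-c[[:space:]]*30[[:space:]]+"?\$remoteServer' "$script"
}
```

On the appliance, backup_script /opt/rsa/setup/sh/copy_functions.sh would be run before the edit, and has_ping_workaround /opt/rsa/setup/sh/copy_functions.sh afterwards. A zero return status from the second call indicates the ping line is present.
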
Legacy Article ID: a58284
