000029764 - Authentication failing with F5 BIG-IP Load Balancer version 11.5 or 11.6 with no entries in the Authentication Manager authentication activity logs

Document created by RSA Customer Support Employee on Jun 14, 2016. Last modified by RSA Customer Support Employee on Apr 21, 2017.
Version 3

Article Content

Article Number: 000029764
Applies To
RSA Product Set: Authentication Manager, SecurID
RSA Product/Service Type: Authentication Manager
RSA Version/Condition: 8.1.0, 8.1.1, 8.1 SP1
Platform: VMware
O/S Version: ESXi 5.0
Issue
If more than about 35 seconds pass between the user entering their On-Demand Authentication (ODA) PIN and entering the tokencode, authentication fails without an error in the authentication activity log. If verbose logging is enabled, /opt/rsa/am/server/logs/imsTrace.log shows the following errors:
authmgr.internal.agent.server.DatagramHandlerImpl, DEBUG, rsa8appliance.lab.fp.f5net.com,,,,Failed to process request from client
java.lang.IllegalArgumentException: SessionId cannot be null

Per AM-28606, the imsTrace.log, if verbose logging is enabled, will now show:
“SessionID cannot be found for new PIN mode, maybe port is changed to : <New Source UDP port>”
To enable verbose logging:
  1. Log in to the Security Console.
  2. Choose Setup > System Settings.  
  3. Under Basic Settings, choose Logging.
  4. Select your primary server and click Next.
  5. Set the Trace Log option to Verbose.
  6. If you have replica server(s), you can optionally choose to apply the above settings to the replica instance(s) upon save by checking that option.  This saves you from having to configure each server individually.
  7. When done, click Save.
Now test authentication again and review the imsTrace.log files on each Authentication Manager server for the error message above.
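The review step above can be partly automated. The following is a minimal sketch, assuming the trace log is plain text with one entry per line; the function name `find_session_errors` is hypothetical, and the only strings it looks for are the two messages quoted in this article.

```python
import re
from typing import Iterable, List, Tuple

# Messages quoted in this article. Matching is case-insensitive because the
# article shows both the "SessionId" and "SessionID" spellings.
PATTERNS = [
    re.compile(r"SessionId cannot be null", re.IGNORECASE),
    re.compile(r"SessionID cannot be found for new PIN mode", re.IGNORECASE),
]


def find_session_errors(lines: Iterable[str]) -> List[Tuple[int, str]]:
    """Return (line_number, text) for every line containing a known message."""
    hits = []
    for number, line in enumerate(lines, start=1):
        if any(pattern.search(line) for pattern in PATTERNS):
            hits.append((number, line.rstrip("\n")))
    return hits


# Sample input built from the log excerpt shown earlier in this article.
sample_log = [
    "authmgr.internal.agent.server.DatagramHandlerImpl, DEBUG, Failed to process request from client",
    "java.lang.IllegalArgumentException: SessionId cannot be null",
    "an unrelated line",
]
for number, text in find_session_errors(sample_log):
    print(f"{number}: {text}")
```

To scan a real log, pass an open file object for /opt/rsa/am/server/logs/imsTrace.log to `find_session_errors` on each Authentication Manager server.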
Note: Both On-Demand Authentication and Next Tokencode Mode require two inputs, the second being a tokencode, and both are affected by this issue.
The F5 is doing Network Address Translation (NAT) of the agent as it forwards authentication packets to the Authentication Manager server.
Cause
The F5 (or any NAT router or firewall) applies a UDP timeout, which is approximately 34 seconds on version 11.5 and 60 seconds or slightly more on version 11.6. After this timeout the NAT treats further packets as a new connection and changes the source UDP port. This breaks the RSA authentication session, which assumes that the source port for the Next Tokencode request is the same as it was for the PIN or first tokencode.
Here’s a summary trace on the agent side. The agent sends authentication requests from source UDP port 59997 to the Authentication Manager server on destination UDP port 5500:
 agent side
In frames 7 - 11 the Next Tokencode Mode requests are transmitted four times on a five second timeout.  There is no response seen from the Authentication Manager server.  
When the same packets go through the NAT, the agent's address is translated while the Authentication Manager server's address remains the same:
server side 
The problem is that the NAT has a time-out for translated UDP packets. When that time-out is exceeded, the source UDP port is changed. In this example the NAT changes the port from 60400 to 53640.
The RSA agent API uses the same source UDP port through the entire authentication process until Next Tokencode Mode is entered.  The problem is that the RSA server uses the source UDP port as part of the session ID to keep track that this Next Tokencode request is related to the previous tokencode.  
When the source UDP port changes, Authentication Manager flags the packet as rogue and drops it, in case it is part of a denial-of-service (DoS) attack.  This action places the "SessionID cannot be null" entry in the imsTrace.log.  Since the request is dropped, there is no entry for the transaction in the authentication activity log.
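The effect of a changed source port can be illustrated with a toy model. This is only a sketch of the general idea, not Authentication Manager's actual implementation: the class `ToySessionTable`, the address 192.0.2.10 (a documentation-only address) and the user name `jdoe` are hypothetical, while the ports 60400 and 53640 come from the example above.

```python
from typing import Dict, Optional, Tuple

# Session key as the server sees it: (source IP, source UDP port).
SessionKey = Tuple[str, int]


class ToySessionTable:
    """Toy model of a server that ties a session to the source UDP port."""

    def __init__(self) -> None:
        self._sessions: Dict[SessionKey, str] = {}

    def start(self, src_ip: str, src_port: int, user: str) -> None:
        # First input (PIN or first tokencode) opens the session.
        self._sessions[(src_ip, src_port)] = user

    def lookup(self, src_ip: str, src_port: int) -> Optional[str]:
        # Second input (Next Tokencode) must arrive from the same key.
        return self._sessions.get((src_ip, src_port))


table = ToySessionTable()
table.start("192.0.2.10", 60400, "jdoe")

same_port = table.lookup("192.0.2.10", 60400)  # NAT mapping still alive
new_port = table.lookup("192.0.2.10", 53640)   # NAT mapping timed out

print(same_port)
print(new_port)
```

In this model the second lookup finds no session, which corresponds to the dropped request and the missing entry in the authentication activity log.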
F5 confirms that this UDP timeout is about 30 seconds in version 11.5 and about 60 seconds in version 11.6.
Resolution
There are up to three ways to fix this:
1.    Upgrade the F5 to version 11.6.
2.    Configure either ‘preserve strict’ or ‘SNAT automap’ on the F5 (see the F5 support articles).
3.    As a last resort, you might need an iRule, since this is UDP and not TCP.
Review the F5 support articles to avoid using NAT or to increase the NAT UDP time-out.  On the F5 device you can use SNAT (Secure Network Address Translation) automap or the option to preserve the strict value configured for the source port.  SNAT automap is what people usually try first.
F5 version 11.6 gives you a longer UDP NAT time-out, at just over 60 seconds, which may be enough time to avoid this.
Workaround
This problem is only seen when there is a Network Address Translation (NAT) router or firewall between the F5 agent and the Authentication Manager server(s).  It may be related to the RSA agent API, as F5 reproduced it with the RSA Authentication Agent for Windows, but only when NAT was enabled between the authentication agent and the Authentication Manager server(s).  F5 used a pfSense router/firewall for the NAT, so one workaround is to avoid using NAT between an F5 and the Authentication Manager server(s).
F5 also reports that version 11.5.x gives about a 35 second window between entering the PIN and entering the tokencode for On-Demand Authentication, and that the window increases to 60 seconds with F5 version 11.6.  A second workaround is therefore to upgrade the F5 to version 11.6 to get the larger 60 second time-out.
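As a quick illustration of these windows, the sketch below checks whether a given delay before the second input would still fit. The values 35 and 60 are the approximate figures reported in this article; the names `UDP_WINDOW_SECONDS` and `fits_window` are hypothetical.

```python
# Approximate windows reported by F5 in this article, in seconds.
UDP_WINDOW_SECONDS = {"11.5": 35, "11.6": 60}


def fits_window(delay_seconds: float, f5_version: str) -> bool:
    """True if the second input arrives before the reported UDP time-out."""
    return delay_seconds < UDP_WINDOW_SECONDS[f5_version]


# A user who takes 40 seconds to receive and type the tokencode:
print(fits_window(40, "11.5"))
print(fits_window(40, "11.6"))
```

Because the reported time-outs are approximate, delays close to the limit should be treated as likely failures.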
Users can close the browser after entering their PIN, when the Enter Next Tokencode prompt first appears, then open a new F5 login session and enter that tokencode instead of their PIN.  This is not considered tokencode reuse: the first time, the tokencode was sent without a valid session ID, so the Authentication Manager server never processed it and it has not been used.
This situation shows the difference between three time-outs: the 35-60 second F5 window described in this article, the normal 2-3 minute validity window of a tokencode in Next Tokencode Mode, and the expiration of the On-Demand code itself, which is often 30 minutes.  In other words, the tokencode is still good after a login fails this way, and remains good for up to 30 minutes.  Alternatively, users can be told to enter their tokencodes as quickly as they can.
Related work-arounds for other F5 problems include:
  • Error: "103: Invalid Browser State. Either your browser does not support HTTP Cookies or you took too long to respond to the next TokenCode request" when using RSA Authentication Agents in New PIN Mode.
  • The Persistence setting on the F5 BigIP is turned off.
  • The load balancer was not maintaining state during the multi-transaction.
To correct these issues, set the load balancer's Persistence setting to On to maintain state.
Notes
Screen captures of a standard authentication are shown below.  In a tcpdump capture filtered for the UDP 5500 authentication traffic, the only readable byte is the first byte of the UDP data payload (offset 0x2A), shown here in Frame 45.  The highlighted value is 67, a time request to the Authentication Manager server.  The rest of the payload is encrypted.
User-added image

From the agent side we see 5B, a lock request in Frame 47 which prepares the Authentication Manager server(s) for an authentication request.  This enables the adjudicator service to send the authentication request to only one server in the deployment, which prevents passcode reuse attacks.
User-added image
In the next screen capture, Frame 49 shows an authentication request (5C).  In this trace the PIN is being entered.
Authentication Request 5C
Frame 67 shows the server response (6C).
Server Response 6C
A packet capture showing the issue with the F5 is below:

Frame  Time      SecurID Message Type  Action
The agent starts the communication using UDP port 50478
7      14:31:18  5B                    Lock request sent from agent to Authentication Manager server
8      14:31:18  6C                    Authentication Manager server response to agent
9      14:31:18  5C                    Auth request/PIN [email sent 14:31:18.1 to <UserID@email>]. No more emails sent after this
10     14:31:20  6C                    Server response. The server sits on responses for two seconds
40 seconds later the agent is using source UDP port 2041
11     14:32:00  62                    Auth/NTC (No reply; assume the Authentication Manager imsTrace.log shows the “SessionID cannot be null” message while the authentication activity monitor is blank)
3.5 seconds later the agent is now using source UDP port 36211
12     14:32:04  5B                    Lock request sent from agent to Authentication Manager server
13     14:32:04  6C                    Authentication Manager server response to agent (assume this is to the lock request)
14     14:32:04  5C                    Auth/PIN (Do not recall entering this)
1 second later (5 seconds after entering Next Tokencode Mode) the agent is now using UDP port 2041
15     14:32:04  62                    Auth/Next Tokencode request
2 seconds later the agent is using UDP port 36211
16     14:32:06  6C                    Authentication Manager server
5 second timeouts, agent now using UDP port 2041
17     14:32:10  62                    Auth/Next Tokencode request
18     14:32:15  62                    Auth/Next Tokencode request
19     14:32:20  62                    Auth/Next Tokencode request
26 seconds later the agent is again using UDP port 36211
20     14:32:46  62                    Auth/Next Tokencode request
21     14:32:48  6C                    Authentication Manager server response:  fail
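The port changes in the capture above can be listed programmatically. The sketch below transcribes the frame numbers, times, message types and agent-side UDP ports from the capture, with the port column following the capture's own annotations, and reports each change together with the idle time that preceded it. The helper names `seconds` and `port_changes` are hypothetical, and times are in whole seconds as shown in the capture.

```python
from datetime import datetime
from typing import List, Tuple

# (frame, time, message type, agent-side UDP port), transcribed from the
# capture above.
FRAMES = [
    (7, "14:31:18", "5B", 50478),
    (8, "14:31:18", "6C", 50478),
    (9, "14:31:18", "5C", 50478),
    (10, "14:31:20", "6C", 50478),
    (11, "14:32:00", "62", 2041),
    (12, "14:32:04", "5B", 36211),
    (13, "14:32:04", "6C", 36211),
    (14, "14:32:04", "5C", 36211),
    (15, "14:32:04", "62", 2041),
    (16, "14:32:06", "6C", 36211),
    (17, "14:32:10", "62", 2041),
    (18, "14:32:15", "62", 2041),
    (19, "14:32:20", "62", 2041),
    (20, "14:32:46", "62", 36211),
    (21, "14:32:48", "6C", 36211),
]


def seconds(clock: str) -> int:
    """Convert an HH:MM:SS string to seconds since midnight."""
    parsed = datetime.strptime(clock, "%H:%M:%S")
    return parsed.hour * 3600 + parsed.minute * 60 + parsed.second


def port_changes(frames) -> List[Tuple[int, int, int, int]]:
    """Return (frame, gap_seconds, old_port, new_port) for each port change."""
    changes = []
    for prev, cur in zip(frames, frames[1:]):
        if cur[3] != prev[3]:
            gap = seconds(cur[1]) - seconds(prev[1])
            changes.append((cur[0], gap, prev[3], cur[3]))
    return changes


for frame, gap, old, new in port_changes(FRAMES):
    print(f"frame {frame}: port {old} -> {new} after {gap} s")
```

The first change reported is at frame 11, after a 40 second gap, which is longer than the roughly 35 second window reported for F5 version 11.5.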

Note that message ID 62 defines a request for Next Tokencode (NTC), which is how On-Demand Authentication works:
  1. The user enters their PIN first.
  2. The tokencode is delivered via email or SMS.
  3. The user enters the tokencode, which Authentication Manager treats as a token in Next Tokencode Mode. (With regular tokens the first packet would contain both PIN and tokencode, and the server would only prompt for NTC if the user had failed some previous logins or the tokencode was close to, but not within, the plus or minus 1 minute window of the tokencode expected for the current minute.)
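The message types mentioned in these notes can be collected into a small lookup for reading captures. This is a sketch that covers only the five values named in this article and assumes they are hexadecimal; the names `MESSAGE_TYPES` and `describe_payload` are hypothetical, and the rest of the payload is encrypted, so only the first byte is interpreted.

```python
# First byte of the UDP payload, as described in this article.
# Assumption: all values are hexadecimal.
MESSAGE_TYPES = {
    0x67: "time request",
    0x5B: "lock request",
    0x5C: "authentication request",
    0x6C: "server response",
    0x62: "Next Tokencode request",
}


def describe_payload(payload: bytes) -> str:
    """Name the SecurID message type from the first payload byte."""
    if not payload:
        return "empty payload"
    code = payload[0]
    name = MESSAGE_TYPES.get(code, "unknown")
    return f"{code:02X}: {name}"


# Only the first byte matters here; the remaining bytes are placeholders.
print(describe_payload(bytes([0x62, 0x00, 0x00])))
```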