000020446 - Passwords not flushed from Authorization Server cache

Document created by RSA Customer Support Employee on Jun 15, 2016
Last modified by RSA Customer Support Employee on Apr 21, 2017
Version 2

Article Content

Article Number: 000020446
Applies To: RSA ClearTrust 5.5.3 Authorization Server (AServer)
RSA ClearTrust 5.5.3 Entitlements Server (EServer)
Issue: Use granular cache updates to flush user passwords from the authorization server cache
Users whose password was very recently changed via the administrative API are rejected when they authenticate with the new password, but are authenticated when they use the old password.  This symptom is intermittent.

There are several possible reasons why the authorization server caches may not be updated quickly enough to avoid rejecting a user's newer credentials:

  1. The system architecture does not permit sufficiently fast updating due to LAN/WAN latency.
  2. The authorization server thread that receives a message from the entitlements server to poison the cache entry for a particular user runs at a lower priority than request processing threads, in order to avoid frequent cache poisonings impacting server performance.  Under heavy load on the authorization server, there can be a noticeable delay in poisoning the cache.
  3. When granular cache updates are configured for synchronous operation, meaning that the entitlements server waits for a response from the authorization server indicating receipt of the cache poisoning command, a failure on the part of the authorization server to respond can hang the entitlements server thread that communicates the command, preventing further cache poisoning commands from being broadcast.  Restarting the authorization server or the entitlements server clears the specific condition.

    The authorization server can fail to respond to the command to selectively poison its cache when the thread receiving the command crashes.  In one case, the customer was retrieving user properties that included a multi-value property with a very large amount of data (500KB+); this caused the thread to crash with a java.lang.OutOfMemoryError, preventing a response to the entitlements server.

Causes 1 and 2 are architectural issues that need to be addressed as such: reducing network latency or authorization server load can mitigate the issue.  An architectural workaround for slow cache poisoning after a password change is to use the runtime API to create a valid token and return it to the user once the password has been changed, so that the user can continue accessing the site without authenticating with the new password.  By the time the user is challenged in a later session, the cache entry containing the old password will long since have been purged.

Cause 3 can be mitigated in the entitlements server by using the parameter cleartrust.eserver.runtime.timeout.  This parameter defaults to 0, or no timeout, causing the entitlements server thread that issues cache poisoning commands to wait forever for a response from the authorization server; setting this parameter to a number of seconds greater than zero will cause the entitlements server to time out connections to crashed runtime API threads and allow it to continue broadcasting cache poisoning events.
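The effect of the timeout can be illustrated with a small model. The following is only a sketch in Python, not ClearTrust code: the names `wait_for_ack` and `broadcast` and the server names are hypothetical, and a queue stands in for the network connection. It mirrors the behavior described above, where a value of 0 means waiting forever and a positive number of seconds lets the broadcast move on past a server that never answers.

```python
import queue
import threading


def wait_for_ack(ack_queue, timeout_seconds):
    """Wait for an acknowledgement; 0 means wait forever (the default described above)."""
    timeout = None if timeout_seconds == 0 else timeout_seconds
    try:
        ack_queue.get(timeout=timeout)
        return True
    except queue.Empty:
        # The server did not answer in time; give up on it and carry on.
        return False


def broadcast(servers, timeout_seconds):
    """Notify each (name, responds) pair in turn and record which ones answered."""
    results = {}
    for name, responds in servers:
        ack_queue = queue.Queue()  # stands in for the connection to one server
        if responds:
            # A healthy server acknowledges from its own thread.
            threading.Thread(target=ack_queue.put, args=("ack",)).start()
        results[name] = wait_for_ack(ack_queue, timeout_seconds)
    return results


if __name__ == "__main__":
    # The second (hypothetical) server never answers, as if its thread had crashed.
    print(broadcast([("aserver1", True), ("aserver2", False), ("aserver3", True)], 0.5))
```

With a timeout of 0.5 seconds, the third server is still notified even though the second never responds. Calling `broadcast` with a timeout of 0 on the same list would block indefinitely at the second server, which is the hang described in cause 3. In the entitlements server configuration, the corresponding setting would presumably be a line such as `cleartrust.eserver.runtime.timeout=30`; both the key=value form and the value 30 are illustrative assumptions here, not recommendations from the article.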

With LDAP datastores, the entire issue of password latency in the cache can be avoided by configuring authentication via an LDAP bind channel.  In that configuration, passwords are not cached by the authorization server; every authentication attempt is passed directly to the LDAP datastore for an LDAP bind.
Workaround: Restarting the entitlements server or the authorization server clears the issue.

When console debug output is enabled in the entitlements server (with the flag -DDEBUG), the following output is observed (on the console or in the debug log) when a granular cache update is broadcast:

10:03:46:355 [*] [APIClientProxy_39] - Fire DataUpdateEvent:
type: priority UPDATE
object: UniqueIdentifier (Data Store Type: LDAP Store ID: Testv1 Class Identifier: 16) DN:  "uid=1224725,ou=users,cn=rsa,cn=com"

The above lines show the password change event in the entitlements server triggering a cache poisoning event that will be broadcast to all listed authorization servers.

10:03:46:478 [*] [Queue_Dispatcher_Thread] - AuthServerConnection: started command . Now have 1 active commands
10:03:46:479 [*] [Queue_Dispatcher_Thread] - SEND MSG: DATA_MSG
10:03:46:595 [*] [InputStream_Reader_Thread] - MSG: DATA_MSG
10:03:46:596 [*] [Queue_Dispatcher_Thread] - Removing MuxStreamBundle
10:03:46:596 [*] [Queue_Dispatcher_Thread] - AuthServerConnection: command finished. Now have 0 active commands.

The above lines are the notification of a single authorization server by the entitlements server; for each authorization server listed for the entitlements server, there will be such a block.  A crashed thread in the authorization server will show up in the entitlements server debug output as a failure to receive the line [InputStream_Reader_Thread] - MSG: DATA_MSG, which is the authorization server's normal response.
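Checking a long debug log for a missing response by eye is tedious, so a small script can help. The following Python sketch is a rough heuristic based only on the message texts shown in the excerpt above; the function name `summarize_eserver_log` is hypothetical, and the sketch assumes the log line wording is exactly as quoted. It counts sent commands and received responses and reports the difference.

```python
def summarize_eserver_log(lines):
    """Count cache-poisoning sends and responses in entitlements server debug output."""
    counts = {"started": 0, "sent": 0, "responses": 0, "finished": 0}
    for line in lines:
        if "AuthServerConnection: started command" in line:
            counts["started"] += 1
        elif "[Queue_Dispatcher_Thread] - SEND MSG: DATA_MSG" in line:
            counts["sent"] += 1
        elif "[InputStream_Reader_Thread] - MSG: DATA_MSG" in line:
            counts["responses"] += 1
        elif "AuthServerConnection: command finished" in line:
            counts["finished"] += 1
    # A positive value suggests that at least one command was never acknowledged.
    counts["unanswered"] = counts["sent"] - counts["responses"]
    return counts


if __name__ == "__main__":
    sample = [
        "10:03:46:478 [*] [Queue_Dispatcher_Thread] - AuthServerConnection: started command . Now have 1 active commands",
        "10:03:46:479 [*] [Queue_Dispatcher_Thread] - SEND MSG: DATA_MSG",
    ]
    print(summarize_eserver_log(sample))
```

Because the counts are totals over the whole input, they do not identify which authorization server failed to respond; they only indicate that the log deserves a closer look around the last unmatched SEND MSG line.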

In this particular case, the corresponding debug output in the authorization server (on the console or in the debug log) showed:

10:03:46:718 [*] [InputStream_Reader_Thread] - MSG: DATA_MSG
10:03:46:719 [*] [MUXWORKER-5] - FunctionMapping.createObjectFromFunctionNode(): argCount = 2, myIncludeContextMap = true
10:03:46:719 [*] [MUXWORKER-5] -  getting arg #1
10:03:46:719 [*] [MUXWORKER-5] -  arg #2 (context) = {CLIENT_IP=xxx.xxx.xxx.xxx, groups=false, CLIENT_VERSION=7, props=false, CLIENT_PORT=36384, tokens=false}
10:03:46:719 [*] [MUXWORKER-5] - FunctionMapping.createObjectFromFunctionNode(): about to invoke method public java.util.Map sirrus.authserver.TCPServerAPIAdaptor.clearServerCaches(java.util.Map,java.util.Map) on 2 args

The above lines show the initial receipt of the command to selectively poison the cache, including the uid of the user for whom the cache entry should be flushed.

00:07:23:546 [*] [MUXWORKER-5] - AuthorizationAPI.clearServerCaches( {UniqueIdentifier=QEAAAAETERBUAAAAAZUZXN0djEAAAABEAAAAB91aWQ9RzAwNDMyNyxvdT11c2VycyxvPWJlYXIuY29t, OperationType=UpdateOperation}, {CLIENT_IP=xxx.xxx.xxx.xxx, CLIENT_VERSION=7, CLIENT_PORT=36384, USER_GROUPS_ENABLED=false, TOKENS_ENABLED=false, USER_PROPERTIES_ENABLED=false} ) returning {}
00:07:23:546 [*] [MUXWORKER-5] - TCPServerAPIAdaptor.clearServerCaches( {UniqueIdentifier=QEAAAAETERBUAAAAAZUZXN0djEAAAABEAAAAB91aWQ9RzAwNDMyNyxvdT11c2VycyxvPWJlYXIuY29t, OperationType=UpdateOperation}, {CLIENT_IP=xxx.xxx.xxx.xxx, groups=false, CLIENT_VERSION=7, props=false, CLIENT_PORT=36384, tokens=false} ) returning {}
00:07:23:546 [*] [MUXWORKER-5] -  result: {}
00:07:23:546 [*] [MUXWORKER-5] - SEND MSG: DATA_MSG
00:07:23:547 [*] [MUXWORKER-5] - Removing MuxStreamBundle

The above lines show the actual receipt of the command and the response to the entitlements server that the command was received and added to the request queue for processing (these lines do not indicate that the cache has actually been flushed).  If the thread in question never logs SEND MSG: DATA_MSG, it has not responded to the entitlements server.
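The same check can be sketched for the authorization server side. The following Python example is an illustration only: the function name `unanswered_workers` is hypothetical, and it assumes the line layout and message texts match the excerpts above. It lists the worker threads that logged the invocation of clearServerCaches but never logged SEND MSG: DATA_MSG afterwards.

```python
import re

# Matches the thread name and the message text of a MUXWORKER debug line.
WORKER_LINE = re.compile(r"\[(MUXWORKER-\d+)\] - (.*)")


def unanswered_workers(lines):
    """Return worker threads that began clearServerCaches but never sent a reply."""
    pending = set()
    for line in lines:
        match = WORKER_LINE.search(line)
        if not match:
            continue
        worker, message = match.groups()
        if "about to invoke method" in message and "clearServerCaches" in message:
            pending.add(worker)      # command received, reply still outstanding
        elif message.startswith("SEND MSG: DATA_MSG"):
            pending.discard(worker)  # reply sent to the entitlements server
    return sorted(pending)


if __name__ == "__main__":
    sample = [
        "10:03:46:719 [*] [MUXWORKER-5] - FunctionMapping.createObjectFromFunctionNode(): "
        "about to invoke method public java.util.Map "
        "sirrus.authserver.TCPServerAPIAdaptor.clearServerCaches(java.util.Map,java.util.Map) on 2 args",
    ]
    print(unanswered_workers(sample))
```

A worker that appears in the result is a candidate for the crashed thread described above; searching the same log for java.lang.OutOfMemoryError near that worker's last line is a reasonable next step.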

Legacy Article ID: a34531