Applies To
RSA Product Set: Identity Governance & Lifecycle
RSA Version/Condition: 7.1.x
Emails are being generated for Queue Depth admin errors. For example:
A new admin error has been generated.
The /home/oracle/wildfly-10.1.0.Final/standalone/log/aveksaServer.log has messages like these:
03/27/2018 15:21:22.442 WARN (Worker_eventq#Event Queue#WPDS_16471) [com.aveksa.server.workflow.statistics.QueueDepthProcessor] WARNING: Monitor[alertq#Normal] has taken 18886 ms to process an item. please check logs for details.
03/27/2018 15:29:11.669 WARN (Worker_eventq#Event Queue#WPDS_16471) [com.aveksa.server.workflow.statistics.QueueDepthProcessor] ERROR: Monitor[actionq#Role] has taken 141751 ms to process an item. please check logs for details.
03/27/2018 15:37:12.223 WARN (Worker_eventq#Event Queue#WPDS_16468) [com.aveksa.server.workflow.statistics.QueueDepthProcessor] CRITICAL: Monitor[jobq#Normal] has taken 505538 ms to process an item. please check logs for details.
These messages may be filling up the aveksaServer.log.
Automated workflow tasks run in queues. These messages indicate that the workflow queues are backed up and that workflow activities have been waiting in the queues longer than the notification thresholds allow.
Depending on the length of time workflow activities have been waiting in the queue, the message is labeled WARNING, ERROR, or CRITICAL, where CRITICAL indicates the longest wait of the three types.
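To gauge how many of these messages are in the log and which queues they involve, the lines can be counted by severity label and monitor name. The following is a minimal Python sketch, not an RSA-supplied tool. The regular expression is an assumption derived only from the three sample lines shown above, and the function names are illustrative.

```python
import re
from collections import Counter

# Assumption: the pattern is modeled on the three sample log lines above.
# Other product versions may format these messages differently.
LINE_RE = re.compile(
    r"QueueDepthProcessor\]\s+(?P<severity>WARNING|ERROR|CRITICAL): "
    r"Monitor\[(?P<queue>[^\]]+)\] has taken (?P<ms>\d+) ms"
)


def parse_queue_depth_lines(lines):
    """Yield (severity, queue, ms) for each Queue Depth message found."""
    for line in lines:
        match = LINE_RE.search(line)
        if match:
            yield match["severity"], match["queue"], int(match["ms"])


def summarize(lines):
    """Count messages and track the longest wait per (severity, queue)."""
    counts = Counter()
    longest = {}
    for severity, queue, ms in parse_queue_depth_lines(lines):
        key = (severity, queue)
        counts[key] += 1
        longest[key] = max(ms, longest.get(key, 0))
    return counts, longest


# The message portion of the sample lines from this article.
SAMPLE_LINES = [
    "[com.aveksa.server.workflow.statistics.QueueDepthProcessor] WARNING: "
    "Monitor[alertq#Normal] has taken 18886 ms to process an item.",
    "[com.aveksa.server.workflow.statistics.QueueDepthProcessor] ERROR: "
    "Monitor[actionq#Role] has taken 141751 ms to process an item.",
    "[com.aveksa.server.workflow.statistics.QueueDepthProcessor] CRITICAL: "
    "Monitor[jobq#Normal] has taken 505538 ms to process an item.",
]

counts, longest = summarize(SAMPLE_LINES)
for severity, queue in sorted(counts):
    key = (severity, queue)
    print(severity, queue, "count:", counts[key], "longest ms:", longest[key])
```

To run this against a real log, pass an open file object for aveksaServer.log to summarize() in place of SAMPLE_LINES.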
Common causes are long-running SQL queries, looping workflows, and/or database locks.
Long waits in the Workpoint queues can result in performance degradation.
When these messages appear, look at the system as a whole. Check for a system issue such as long-running queries or database locks, and review your workflow implementation for a workflow issue such as looping workflows. Also determine whether the messages coincide with a one-time activity such as an upgrade or a patch, in which case they may be ignored.
Queues may be monitored from the RSA Identity Governance & Lifecycle User Interface (Admin > Workflow > Monitoring).
For further help, please contact RSA Customer Support.
The wait time thresholds for sending the notifications are controlled by three variables that were introduced when the queue monitoring functionality was implemented. These variables hold the threshold values for how long a workflow task may sit in a queue before a notification is sent out and logged to the aveksaServer.log. Threshold values can be set for warning, error, and critical notifications. These variables and their default values are:
To reduce the amount of logging to the aveksaServer.log and the number of emails generated by these notifications, the default thresholds may need to be increased to more meaningful values based on your system's usage. There is no "right" value to set them to. The goal is a high water mark beyond what your regular usage should be hitting. If you are seeing this message because of a single item with an extremely high wait time, then the message is doing its job correctly. Only adjust the thresholds if you are being flooded with warnings.
To determine the best values for your implementation, in the RSA Identity Governance & Lifecycle User Interface, go to Admin > Admin Errors and add the Details column to the table. Then search the table for the string queue. This should pull up all the recent hits, and the Details column should show the values in milliseconds (ms) that are triggering the notification. If you are seeing more than just WARNING messages, such as ERROR and CRITICAL, then the thresholds for those may need to be adjusted as well.
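Once the millisecond values have been collected from the Details column, one way to turn them into a candidate threshold is to take the highest value seen in regular usage and add some headroom. The sketch below is only an illustration of that arithmetic. The function name, the 20 percent headroom, the option to set aside the most extreme values, and the sample numbers are all assumptions, not RSA recommendations.

```python
def suggest_threshold_ms(observed_ms, headroom_pct=20, drop_highest=0):
    """Return a candidate threshold in ms, rounded up to a whole second.

    observed_ms   wait times (ms) collected for one severity level
    headroom_pct  hypothetical margin added above the high water mark
    drop_highest  number of extreme values to set aside as genuine
                  incidents rather than regular usage
    """
    if not observed_ms:
        raise ValueError("no observed values")
    values = sorted(observed_ms)
    if drop_highest:
        # Keep at least one value if everything would be dropped.
        values = values[:-drop_highest] or values[:1]
    high_water = values[-1]
    # Integer arithmetic: add the headroom, then round up to the next 1000 ms.
    padded_times_100 = high_water * (100 + headroom_pct)
    return -(-padded_times_100 // 100_000) * 1000


# Made-up sample values for illustration only.
observed = [18886, 21040, 19500, 141751]
print(suggest_threshold_ms(observed, drop_highest=1))  # prints 26000
```

Here the single 141751 ms value is treated as a real incident and set aside, so the suggestion is based on the remaining values. Whether a given value is an incident or regular usage is a judgment call for your own system.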
Once you have determined a meaningful new high water mark, in the RSA Identity Governance & Lifecycle User Interface, go to Admin > System > Edit and add the following three parameters to the Custom section at the bottom of the screen, replacing the X, Y, and Z values with the new values you have determined are best for your implementation.