|Applies To||RSA Product Set: Adaptive Authentication (OnPrem)|
RSA Product/Service Type: Adaptive Authentication (OnPrem)
RSA Version/Condition: 7.1
O/S Version: Windows Server 2008 R2 Standard (64-bit)
|Issue||The customer reported a performance issue after deactivating a rule, and RSA Support took the case to investigate. The issue was reproduced in the customer's environments together with RSA: with the rule deactivated and normal traffic, the customer experienced performance degradation.|
Our internal teams investigated the issue using the logs the customer collected while running traces at the DB level.
Support/Engineering Team RSA
- Our Engineering and Support teams analyzed aa_server_log.2016-04-12 for the reported issue: higher response times after disabling rule 5 and then restarting the Application Server.
- The Performance Engineer checked the logs and found that the response time for Analyze calls was already much higher before the rule was disabled and the Application Server was restarted.
- As the graphic below shows, the response time was already higher before the rule was disabled.
- Only a few occurrences are shown in the table above; refer to the chart below for the overall performance.
|Tasks||Before identifying the observed pattern, our Engineering and Support teams performed an exhaustive analysis of the logs provided and shared the following assumptions:|
- On 04/18/2016 Engineering confirmed that the slow performance is NOT caused by deactivating the rule.
- On 05/25/2016 Engineering confirmed that response times were consistently high between 17:00 and 18:00.
- On 06/22/2016 Engineering analyzed all the logs, DB traces and thread dumps from Galicia's test environment and saw no difference in response time with or without the rule deactivated.
- Very high response times were observed in the environment, possibly for the following reasons:
- DB Traces were enabled.
- Environment is not powerful enough to take the load.
- DB traces show that all the queries in the test window took longer (~5 – 11 seconds). This points to an environment issue: either a lack of resources or extra load from having DB traces enabled.
- There is no problem in the Application Layer; thread dump analysis shows NO threads blocked or waiting on any condition.
- Incident: High response time in the SQL statements
- Our Engineering Team advises that the Application Servers do not need to be restarted after suspending a rule. Policy changes take effect after a few minutes, once they have been refreshed in the cache.
- Our Engineering Team highly recommends moving the Adaptive Authentication (AA) database to a dedicated database server that can comfortably handle the given workload. Galicia confirmed last week that other application databases are hosted on the same database server as the AA database. As a result, a small hiccup on the Application Server side had a ripple effect on the database server, which had a big impact on database performance.
- Date and Time: June 16, 2016, from 17:19 to 17:57
- Impact: Performance Degradation in DB
Note: SQL Profiler traces are available only from 17:19 hrs
- From the final testing on June 16, RSA Engineering analyzed the SQL Profiler traces and found that the SQL statements relevant to Analyze calls, as well as all the other statements, showed very HIGH response times.
- This behavior was observed from 17:10 hrs, before the rule suspension, and continued until the end of the testing.
- The high response time pattern is NOT caused by Rule disablement.
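The reasoning behind this conclusion can be sketched as a before/after comparison: if statements are already slow before the rule is suspended, the suspension cannot be what triggered the slowdown. The Python sketch below is illustrative only. The sample durations, the suspension time, the 5-second threshold and the function name `split_by_event` are all assumptions made up for the example and are not taken from the actual SQL Profiler traces.

```python
from datetime import datetime
from statistics import mean


def split_by_event(samples, event_time):
    """Return (mean_before, mean_after) of durations around event_time.

    samples: iterable of (timestamp, duration_seconds) pairs.
    """
    before = [d for t, d in samples if t < event_time]
    after = [d for t, d in samples if t >= event_time]
    if not before or not after:
        raise ValueError("need samples on both sides of the event")
    return mean(before), mean(after)


# Hypothetical values, NOT taken from the customer's traces.
fmt = "%Y-%m-%d %H:%M"
samples = [
    (datetime.strptime("2016-06-16 17:20", fmt), 6.0),
    (datetime.strptime("2016-06-16 17:25", fmt), 8.0),
    (datetime.strptime("2016-06-16 17:35", fmt), 7.0),
    (datetime.strptime("2016-06-16 17:50", fmt), 9.0),
]
# Assumed suspension time; the article does not state the exact minute.
suspension = datetime.strptime("2016-06-16 17:30", fmt)

mean_before, mean_after = split_by_event(samples, suspension)

# Assumed threshold: treat a mean of 5 seconds or more as "already slow".
already_slow = mean_before >= 5.0
print(mean_before, mean_after, already_slow)
```

With real data, the samples would come from the statement durations in the trace output, and a high mean on the "before" side would support the same conclusion.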
Database maintenance activities.
- Our Engineering Team recommends the following database maintenance activities for better performance:
1. Monitor and de-fragment:
Regularly monitor and de-fragment tables and indexes.
Cleanup activities fragment the tables and indexes, which leads to performance degradation.
Use index reorganization or rebuild methods.
2. Collect and update:
Collect and update the statistics for tables and Index daily during the off-peak hours (with at least 10% sampling).
Frequency of index: DAILY
3. Re-size Data File:
Proactively re-size the data files to avoid on-the-fly auto-extension.
This will reduce the recursive SQL statements, which may cause spikes.
4. Ensure all the DB Maintenance activities do not overlap with the Offline task window and among themselves.
DB maintenance can be scheduled using the “Maintenance Plan Wizard”.
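The non-overlap requirement in item 4 can be checked mechanically before a schedule is put in place. The following Python sketch is one possible way to do it; the window names and times are hypothetical, `find_overlaps` is a helper invented for this example, and it assumes every window starts and ends on the same day.

```python
from datetime import time
from itertools import combinations


def find_overlaps(windows):
    """Return pairs of window names whose time ranges overlap.

    windows: dict mapping a name to a (start, end) pair of datetime.time
    values on the same day, with start < end.
    """
    clashes = []
    for (name_a, (start_a, end_a)), (name_b, (start_b, end_b)) in combinations(
        windows.items(), 2
    ):
        # Two half-open ranges overlap when each starts before the other ends.
        if start_a < end_b and start_b < end_a:
            clashes.append((name_a, name_b))
    return clashes


# Hypothetical schedule, for illustration only.
schedule = {
    "offline_tasks": (time(1, 0), time(3, 0)),
    "index_maintenance": (time(2, 30), time(4, 0)),
    "update_statistics": (time(4, 0), time(5, 0)),
}

clashes = find_overlaps(schedule)
print(clashes)
```

Here the index maintenance window starts before the offline tasks finish, so that pair is reported, while the back-to-back index and statistics windows are not. Windows that cross midnight would need full datetimes instead of times of day.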
|Notes||RSA Engineering suspects the following causes of the performance degradation:|
- Restarting the Application Servers might have queued up client requests and caused a sudden increase in the workload.
- Some of the many application databases hosted alongside the AA database might have contended with it for resources.