Applies To
RSA Product Set: RSA Identity Governance and Lifecycle
RSA Version/Condition: 7.0
RSA Via Lifecycle and Governance 7.0 Collection Changes
One of the goals of the RSA Via Lifecycle and Governance 7.0 collector redesign was to make collection times easier to understand. This does not mean that they are fully predictable, because the time a collection takes still depends on the amount of change in the collected data. However, the time should be consistent with the percentage of change, regardless of the run.
Collection Performance
Initial Collection
An initial collection is still expected to take the most time for a collector. During this run the collector needs to process the objects and the direct relationships that require normalization and resolution, as well as create the records in the Master tables.
Collection 5% Change
A collection with a 5% change in the data should take significantly less time than an initial collection, but not necessarily 95% less. Many factors contribute to this, such as the amount of data that is collected, the time to do the delta calculations, and the time for taking statistics. These costs remain somewhat consistent from run to run, whatever the amount of change.
Collection 10% Change
A collection with a 10% change in the data should also take significantly less time than an initial collection, but it should take longer than a collection with a 5% change. The time difference may not be linear, because of the same factors mentioned previously.
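The relationship described above can be illustrated with a deliberately simple sketch. It is not how the product calculates anything: the function name, the ten-minute initial time, and the 20% fixed share are all hypothetical values chosen for illustration. It assumes that a fixed share of the initial collection time (reading the data, delta calculations, statistics) is paid on every run and that only the remainder scales with the percentage of change.

```python
def estimated_collection_minutes(initial_minutes, change_fraction, fixed_fraction=0.2):
    """Estimate a subsequent collection time under a hypothetical model.

    initial_minutes  -- processing time of the initial collection
    change_fraction  -- share of the data that changed (0.0 to 1.0)
    fixed_fraction   -- ASSUMED share of the initial time that is spent on
                        every run regardless of change (illustrative only)
    """
    if not 0.0 <= change_fraction <= 1.0:
        raise ValueError("change_fraction must be between 0.0 and 1.0")
    if not 0.0 <= fixed_fraction <= 1.0:
        raise ValueError("fixed_fraction must be between 0.0 and 1.0")

    # Cost paid on every run, even when little has changed.
    fixed_cost = initial_minutes * fixed_fraction
    # Cost that grows with the amount of changed data.
    variable_cost = initial_minutes * (1.0 - fixed_fraction) * change_fraction
    return fixed_cost + variable_cost


if __name__ == "__main__":
    initial = 10.0  # hypothetical initial collection time in minutes
    for change in (0.05, 0.10, 1.0):
        minutes = estimated_collection_minutes(initial, change)
        print(f"{change:.0%} change: about {minutes:.1f} minutes")
```

Under these assumed numbers, a 5% change is much faster than the initial collection but nowhere near 95% faster, and a 10% change takes somewhat longer than a 5% change without taking twice as long. Real collections may not follow even this simple pattern.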
Collection Full Refresh
The concept of a full refresh has been around for a few versions now. With 7.0, the collection wizard automatically triggers a full refresh for certain types of collection configuration changes. A full refresh is similar to an initial collection in that it treats everything as New, and it can be done for just the Objects, just the Relationships, or Both. This results in a longer collection time. When both the objects and the relationships are refreshed, the time should be similar to that of an initial collection.
Factors in Performance
Some configurations can add extra time to a collection. One is custom user type attributes. With multiple attributes of this type, a collection has to iterate over each one individually. Resolution rules behave similarly. With multiple resolution rules, the collection has to go through each rule one at a time.
The Indirect Explosion process was introduced in the 7.0 release and is responsible for calculating all the indirect or derived entitlements for Users, Accounts, Groups, and Roles in the system. One of the main factors that influences the processing time is the hierarchy of the relationships. The deeper the nesting, the longer it takes to process the indirect entitlements.
Indirect Explosion Initial Collection
The indirect explosion for an initial collection run is typically the longest, as it has to determine all the derived entitlements for all objects in the system. This does not mean just the objects that were collected. For example, an Entitlement Data Collector (EDC) collects entitlements for groups, but accounts are members of groups and Users have accounts, so the process must calculate the derived entitlements for them as well.
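The EDC example above can be sketched as a small graph walk. This is only an illustration of the idea of derived entitlements, not the product's implementation: the dictionaries, object names, and function names below are all hypothetical, and the real process works against the product's own tables.

```python
# Hypothetical sample data (not taken from any real system).
user_accounts = {"alice": ["acct_a"], "bob": ["acct_b"]}
account_groups = {"acct_a": ["admins"], "acct_b": ["staff"]}
# "admins" is nested inside "staff", so its members inherit from "staff".
group_parents = {"admins": ["staff"]}
group_entitlements = {
    "admins": {"reset_password"},
    "staff": {"read_reports"},
}


def groups_for_account(account):
    """Return every group the account belongs to, directly or through nesting."""
    seen = set()
    stack = list(account_groups.get(account, []))
    while stack:
        group = stack.pop()
        if group in seen:
            continue  # guards against cycles and repeated work
        seen.add(group)
        stack.extend(group_parents.get(group, []))
    return seen


def derived_entitlements(user):
    """Collect the entitlements a user holds through accounts and groups."""
    result = set()
    for account in user_accounts.get(user, []):
        for group in groups_for_account(account):
            result |= group_entitlements.get(group, set())
    return result


if __name__ == "__main__":
    for user in sorted(user_accounts):
        print(user, sorted(derived_entitlements(user)))
```

Even though only group entitlements were collected in this sketch, every user and account that reaches a group has to be visited, and each extra level of nesting adds another step to the walk.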
Indirect Explosion Subsequent Collection
The indirect explosion for a subsequent run of a collector typically takes less time than for an initial collection. As stated previously, nested objects are one of the things that can factor into the time. Removing an entitlement from an application role is simple, but if that application role is used by another application role, which is in turn used in a group that has multiple accounts as members, then the impact becomes a little larger.
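The way a single change fans out through nested objects can be sketched as follows. The object names and the `used_by` mapping are hypothetical, and this is only an illustration of the fan-out, not the product's algorithm.

```python
from collections import deque

# Hypothetical "used by" relationships: each key is referenced by the
# objects in its list.
used_by = {
    "role:app_read": ["role:app_admin"],
    "role:app_admin": ["group:ops"],
    "group:ops": ["account:a1", "account:a2", "account:a3"],
}


def affected_objects(changed):
    """Return every object whose derived entitlements depend on `changed`."""
    affected = []
    seen = {changed}
    queue = deque([changed])
    while queue:
        node = queue.popleft()
        for dependent in used_by.get(node, []):
            if dependent not in seen:
                seen.add(dependent)
                affected.append(dependent)
                queue.append(dependent)
    return affected


if __name__ == "__main__":
    impacted = affected_objects("role:app_read")
    print(len(impacted), "objects need recalculation:", impacted)
```

In this sample data, one removal from the innermost application role touches the outer role, the group, and all three member accounts, whereas the same removal from an object that nothing references would touch nothing else.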
Hypothetical Collection Model
Say there is an Account Data Collector which collects 10,000 accounts and 10,000 mapped users.
If the initial collection takes ten minutes (this is the time to process the data after it has been collected), the expectation is that this is the maximum time the collector will take in any future run.
So what happens in a subsequent collection?
Let’s use the above collection changes in this example.