Platform: 7.0.2 P07
Soft-Appliance, Wildfly, Remote DB
In the past, the customer had an issue in IG&L 7.0.0 where collecting groups in an ADC took a very long time. This was during their initial setup, and they did not log a case with RSA. They had more than 100,000 groups, which translated to several million group-account-user relationships, and IG&L could not handle the load. They therefore stopped collecting groups and began collecting only accounts and account mappings.
Now the customer is evaluating whether to move to IG&L v7.1.1. To help them make important business decisions, they want to understand how well the newer versions (IG&L v7.1.0 and v7.1.1) can collect both groups in an ADC and roles in an RDC.
What is the capability of IG&L 7.1.x to collect groups in an ADC and roles in an RDC in terms of load and performance?
What improvements were made in IG&L 7.1.x to the collection of groups in an ADC and roles in an RDC, in terms of functionality, load and performance?
How many groups and roles can it handle? How fast is the collection?
Thank you.
I don't know the direct answer to your question, but I believe they need assistance from RSA Professional Services: not to speed up collections, but to look into whether they really need to collect all the data they have. In most cases where customers want to collect such large datasets, they don't actually need all of that data for their use case.
For example, in your case 100,000 groups sounds like a DAG use case to me, where the groups might be controlling access to certain resources (just a guess). If they collect all the data related to that, they would end up with a lot of unneeded data that hides the really important things. As Edwin used to say, you might end up collecting information about who has access to files such as the cafeteria food pictures posted every day.
So in the end, the key to such problems is always Data Classification. Once large datasets are properly classified before collection, you know exactly which classifications you need to collect, and you don't bloat the system with useless data.
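
To illustrate the idea, here is a minimal Python sketch of classifying first and collecting second. It is not an IG&L API or collector configuration: the `Group` class, the classification labels, the group names and the member counts are all hypothetical, made up only to show how filtering by classification shrinks the number of membership rows that would need to be collected.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Group:
    """Hypothetical record for a group in the source system (not an IG&L type)."""
    name: str
    classification: str  # assumed label assigned during data classification
    member_count: int    # number of accounts that are members of the group


def select_for_collection(groups, wanted):
    """Keep only the groups whose classification is in the wanted set."""
    return [g for g in groups if g.classification in wanted]


def relationship_count(groups):
    """Total group-membership rows that collecting these groups would produce."""
    return sum(g.member_count for g in groups)


# Made-up sample data for illustration only.
groups = [
    Group("finance-share-rw", "sensitive", 40),
    Group("hr-records-ro", "sensitive", 25),
    Group("cafeteria-menu-photos", "public", 5000),
]

selected = select_for_collection(groups, {"sensitive"})

print(relationship_count(groups))    # 5065 rows if everything is collected
print(relationship_count(selected))  # 65 rows if only "sensitive" is collected
```

In a real deployment the equivalent filtering would presumably be done in the source system or in the collector's query, which is the kind of scoping work Professional Services could help with.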