Based on the following document this error should have been fixed with 6.9.1 P11:
We see it appearing more often in 7.0.2 with our WebSphere 8.5 environment.
Is there anything we can do to get this error fixed by configuration?
I completely agree with you.
As mentioned in the other discussions, our problem is identifying what is causing the primary failure.
Therefore we opened cases and community discussions for every related log message.
From my current perspective, I think the primary cause unfortunately does not appear in the logs, and because the workflows are not failing predictably at a specific node, we don't know how to proceed.
That is a typical presentation when you have a workflow that is looping. The workflow that is actually causing the problem has CPU affinity and does not fail, because it is running and has recently allocated memory. All other workflows running at the same time fail unpredictably and at different nodes because they are deprived of the resources they need to complete.
These problems are notoriously hard to troubleshoot and require techniques that are best done through a support case which I see you have open.
This error occurs any time the transaction rollback queue is over capacity. It is a secondary failure and can be expected any time there is an unexpectedly large volume of failed change requests that need to be rolled back. The transaction rollback process was optimized for WebSphere and WebLogic in 6.9.1 P11 but this only makes the process more efficient in normal operation, and it does not cover all potential failure situations.
You must look elsewhere for the initial failure that is causing the workflows to fail.
Note that by the time you get this message the server is typically in a complete failure state and all workflows are failing, so the workflows listed in this error message are of no relevance.
Typically, when customers see this error, they have a bad workflow that is failing chronically with every request, and this is consuming all resources. It is usually also a workflow of no business consequence, or it would already have been identified and resolved. Never ignore workflows that are failing; the failure process is very expensive in terms of CPU, memory, and database resources.
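If it helps, here is a rough sketch of how you might look for a chronically failing workflow once you have request outcomes exported from your logs or audit data. Everything in it is an assumption for illustration: the `(workflow_name, succeeded)` record format, the function name `chronic_failures`, and the thresholds are hypothetical and not part of any product API.

```python
from collections import Counter


def chronic_failures(records, min_requests=20, min_failure_rate=0.9):
    """Return workflows that fail on (nearly) every request.

    records: iterable of (workflow_name, succeeded) pairs. This is a
    hypothetical format; adapt it to whatever your logs actually provide.
    """
    totals = Counter()
    failures = Counter()
    for name, succeeded in records:
        totals[name] += 1
        if not succeeded:
            failures[name] += 1

    flagged = []
    for name, total in totals.items():
        rate = failures[name] / total
        # Ignore workflows with too few requests to judge.
        if total >= min_requests and rate >= min_failure_rate:
            flagged.append((name, total, rate))

    # Highest request volume first: those trigger the most rollback work.
    flagged.sort(key=lambda item: item[1], reverse=True)
    return flagged
```

Anything this returns is a candidate for the primary failure, even if the workflow looks unimportant from a business point of view.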