000034993 - WebSphere clustered server node causes system outages due to 'ORA-02049: timeout: distributed transaction waiting for lock' and 'wpEventQueue ... is not available' errors in RSA Identity Governance & Lifecycle

Document created by RSA Customer Support Employee on Apr 3, 2017. Last modified by RSA Customer Support on Feb 12, 2020.
Version 2

Article Content

Article Number: 000034993
Applies To: RSA Product Set: RSA Identity Governance & Lifecycle
RSA Product/Service Type: Enterprise Software
RSA Version/Condition: All
Platform: IBM WebSphere
 
Issue
Users are unable to log in to or access the RSA Identity Governance & Lifecycle application on a multi-node WebSphere cluster that was previously working.

After restarting the RSA Identity Governance & Lifecycle application, the following errors are noted in the aveksaServer.log for one of the servers.
 

09/15/2016 11:51:11.475 INFO  (server.startup : 2) [com.aveksa.server.core.Container] Starting Service: MessagePublisher
09/15/2016 11:51:17.624 ERROR (server.startup : 2) [com.aveksa.server.message.MessagePublisher] Unable to create JMS session
09/15/2016 11:51:17.625 ERROR (server.startup : 2) [SystemErr] javax.jms.JMSException: CWSIA0241E:
An exception was received during the call to the method JmsManagedConnectionFactoryImpl.createConnection:
com.ibm.websphere.sib.exception.SIResourceException: CWSIT0088E:
There are currently no messaging engines in bus wpBus running.
Additional failure information: CWSIT0103E:
No messaging engine was found that matched the following parameters:

bus=wpBus, targetGroup=null, targetType=BusMember, targetSignificance=Preferred,
transportChain=InboundBasicMessaging, proximity=Bus..


After shutting down this node, the application runs without issue on the remaining cluster nodes.

Note: On WebSphere, the aveksaServer.log, SystemOut.log and SystemErr.log files may be found in a directory similar to the following (where the specific cell and node names would differ): /home/oracle/IBM/WebSphere/AppServer/profiles/AppSrv01/installedApps/vm-support-11Node01Cell/aveksa.ear/aveksa.war/log
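The error signatures discussed in this article can be located quickly with grep. The following is a minimal sketch (not part of the original article); the default path is the example log directory above, so substitute your own profile, cell and node names:

```shell
#!/bin/sh
# Scan the WebSphere logs for the error signatures described in this article:
#   CWSIT0088E - no messaging engines running in bus wpBus
#   CWSIK0025E - wpEventQueue destination not available (high message limit)
#   ORA-02049  - distributed transaction waiting for lock
# LOG_DIR defaults to the example path above; override it for your environment.
LOG_DIR="${LOG_DIR:-/home/oracle/IBM/WebSphere/AppServer/profiles/AppSrv01/installedApps/vm-support-11Node01Cell/aveksa.ear/aveksa.war/log}"

# -l prints only the names of files that contain at least one match.
grep -l -E 'CWSIT0088E|CWSIK0025E|ORA-02049' \
    "$LOG_DIR/aveksaServer.log" \
    "$LOG_DIR/SystemOut.log" \
    "$LOG_DIR/SystemErr.log" 2>/dev/null
```

A node whose logs match any of these signatures is a candidate for the diagnosis described in this article.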

All nodes in the WebSphere clustered environment have the same wpBus configuration and Message store configuration (except for the directory path). The Message store type being used across all nodes in the clustered environment is File store. For example,


wpBus Configuration:
Initial State: Started
Message store type: File store
High message threshold per message point: 50000 messages
Default blocked destination retry interval: 5000 milliseconds

Message Store:
Log size 500 MB
Minimum permanent store size 500 MB (Unlimited permanent store size)
Maximum permanent store size 500 MB
Minimum temporary store size 500 MB (Unlimited temporary store size)
Maximum temporary store size 500 MB


Problem Node:

In addition to the above errors, the aveksaServer.log file for the problematic node contains the following additional errors:

09/15/2016 10:40:24.309 ERROR (SIBJMSRAThreadPool : 8) [com.aveksa.server.message.MessageSubscriber] Listener threw during Message notification, for listener AuthorizationServiceProvider
java.lang.RuntimeException: Illegal TXN State: Cannot commit once a rollback begins. Txn count=1
    at com.aveksa.server.db.persistence.PersistenceServiceProvider.commitTransaction(PersistenceServiceProvider.java:2536)
    at com.aveksa.server.db.persistence.PersistenceServiceProvider.commitTransaction(PersistenceServiceProvider.java:2526)
    at com.aveksa.server.db.persistence.PersistenceServiceProvider.closeJDBCQuery(PersistenceServiceProvider.java:3250)
    at com.aveksa.server.db.persistence.PersistenceServiceProvider.executeJDBCQueryInteger(PersistenceServiceProvider.java:3374)
    at com.aveksa.server.db.PersistenceManager.executeJDBCQueryInteger(PersistenceManager.java:496)
    at com.aveksa.server.authorization.XXAuthorizationServiceProvider.updateImplicitBusinessSourceOwnerEntitlements(XXAuthorizationServiceProvider.java:2009)
    at com.aveksa.server.authorization.XXAuthorizationServiceProvider.internalRefreshAuthorizationData(XXAuthorizationServiceProvider.java:2650)
    at com.aveksa.server.authorization.XXAuthorizationServiceProvider.notifyMessage(XXAuthorizationServiceProvider.java:2581)
    at com.aveksa.server.message.MessageSubscriberProvider.distributeMessage(MessageSubscriberProvider.java:78)
    at com.aveksa.server.message.SubscriberMDB.onMessage(SubscriberMDB.java:78)
    at com.ibm.ejs.container.WASMessageEndpointHandler.invokeJMSMethod(WASMessageEndpointHandler.java:138)
    at com.ibm.ws.ejbcontainer.mdb.MessageEndpointHandler.invokeMdbMethod(MessageEndpointHandler.java:1146)
    at com.ibm.ws.ejbcontainer.mdb.MessageEndpointHandler.invoke(MessageEndpointHandler.java:844)
    at com.sun.proxy.$Proxy27.onMessage(Unknown Source)
    at com.ibm.ws.sib.api.jmsra.impl.JmsJcaEndpointInvokerImpl.invokeEndpoint(JmsJcaEndpointInvokerImpl.java:233)
    at com.ibm.ws.sib.ra.inbound.impl.SibRaDispatcher.dispatch(SibRaDispatcher.java:919)
    at com.ibm.ws.sib.ra.inbound.impl.SibRaSingleProcessListener$SibRaWork.run(SibRaSingleProcessListener.java:592)
    at com.ibm.ejs.j2c.work.WorkProxy.run(WorkProxy.java:668)
    at com.ibm.ws.util.ThreadPool$Worker.run(ThreadPool.java:1865)

09/13/2016 00:05:19.472 ERROR (WebContainer : 35) [com.aveksa.UI] com.aveksa.gui.pages.admin.workflow.workitem.WorkflowWorkItemPageData.<init>(WorkflowWorkItemPageData.java:84) -
com.aveksa.server.workflow.WorkflowServiceException: javax.ejb.EJBTransactionRolledbackException: Transaction rolled back; nested exception is: javax.transaction.TransactionRolledbackException: Transaction is ended due to timeout
    at com.aveksa.server.workflow.WorkflowWorkItem.open(WorkflowWorkItem.java:2164)
    at com.aveksa.gui.objects.workflow.GuiWorkflowWorkItem.open(GuiWorkflowWorkItem.java:233)
    at com.aveksa.gui.pages.admin.workflow.workitem.WorkflowWorkItemPageData.<init>(WorkflowWorkItemPageData.java:82)
    at sun.reflect.GeneratedConstructorAccessor212.newInstance(Unknown Source)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:39)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:527)
    at com.aveksa.gui.pages.PageManager.makeNewPage(PageManager.java:491)
    at com.aveksa.gui.pages.PageManager.handleRequest(PageManager.java:344)
    at com.aveksa.gui.pages.PageManager.handleRequest(PageManager.java:254)
    at com.aveksa.gui.core.MainManager.handleRequest(MainManager.java:176)
    at com.aveksa.gui.core.MainManager.doGet(MainManager.java:125)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:575)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:668)
    at com.ibm.ws.webcontainer.servlet.ServletWrapper.service(ServletWrapper.java:1230)
    at com.ibm.ws.webcontainer.servlet.ServletWrapper.handleRequest(ServletWrapper.java:779)
    at com.ibm.ws.webcontainer.servlet.ServletWrapper.handleRequest(ServletWrapper.java:478)
    at com.ibm.ws.webcontainer.servlet.ServletWrapperImpl.handleRequest(ServletWrapperImpl.java:178)
    at com.ibm.ws.webcontainer.filter.WebAppFilterChain.invokeTarget(WebAppFilterChain.java:136)
    at com.ibm.ws.webcontainer.filter.WebAppFilterChain.doFilter(WebAppFilterChain.java:97)
    at com.aveksa.gui.core.filters.LoginFilter.doFilter(LoginFilter.java:67)
    at com.ibm.ws.webcontainer.filter.FilterInstanceWrapper.doFilter(FilterInstanceWrapper.java:195)
    at com.ibm.ws.webcontainer.filter.WebAppFilterChain.doFilter(WebAppFilterChain.java:91)
    at com.aveksa.gui.util.security.XSSFilter.doFilter(XSSFilter.java:20)
    at com.ibm.ws.webcontainer.filter.FilterInstanceWrapper.doFilter(FilterInstanceWrapper.java:195)
    at com.ibm.ws.webcontainer.filter.WebAppFilterChain.doFilter(WebAppFilterChain.java:91)
    at com.ibm.ws.webcontainer.filter.WebAppFilterManager.doFilter(WebAppFilterManager.java:964)
    at com.ibm.ws.webcontainer.filter.WebAppFilterManager.invokeFilters(WebAppFilterManager.java:1104)
    at com.ibm.ws.webcontainer.servlet.CacheServletWrapper.handleRequest(CacheServletWrapper.java:87)
    at com.ibm.ws.webcontainer.WebContainer.handleRequest(WebContainer.java:914)

09/15/2016 11:08:06.833 ERROR (Worker_actionq#ActionQ1#WPDS_14) [com.aveksa.server.workflow.scripts.WorkflowContextImpl] Error Completing WorkItem: com.aveksa.server.db.PersistenceException: com.ibm.websphere.ce.cm.StaleConnectionException: No more data to read from socket
...
com.aveksa.server.workflow.WorkflowServiceException: com.aveksa.server.db.PersistenceException: com.ibm.websphere.ce.cm.StaleConnectionException: No more data to read from socket
    at com.aveksa.server.workflow.scripts.WorkflowContextImpl.setCompletionInformation(WorkflowContextImpl.java:1066)
    at com.aveksa.server.workflow.scripts.WorkflowContextImpl.completeWorkItem(WorkflowContextImpl.java:954)
    at com.aveksa.server.workflow.scripts.WorkflowContextImpl.completeWorkItem(WorkflowContextImpl.java:897)
    at com.aveksa.server.workflow.scripts.WorkflowContextImpl.completeWorkItem(WorkflowContextImpl.java:1079)
    at com.aveksa.server.workflow.scripts.nodes.BaseWorkflowNode.nodeAvailableAsynchronous(BaseWorkflowNode.java:67)
    at com.aveksa.server.workflow.scripts.nodes.SubprocessNode.nodeAvailableAsynchronous(SubprocessNode.java:41)
    at com.aveksa.server.workflow.scripts.nodes.WorkflowNodeHandler.nodeAvailableAsynchronous(WorkflowNodeHandler.java:55)
    at sun.reflect.GeneratedMethodAccessor200.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:37)
    at java.lang.reflect.Method.invoke(Method.java:611)
    at com.workpoint.server.script.StatementEngineJava.execute(Unknown Source)
    at com.workpoint.server.script.ScriptEngine.A(Unknown Source)
    at com.workpoint.server.script.ScriptEngine.execute(Unknown Source)
    at com.workpoint.server.monitor.ActionMonitorHelper.A(Unknown Source)
    at com.workpoint.server.monitor.ActionMonitorHelper.execute(Unknown Source)
    at com.workpoint.server.pojo.ScriptExecAsyncPvtBean.executeScriptMonitor(Unknown Source)
    at com.workpoint.server.pojo.EJSRemote0SLScriptExecAsyncPvt_EJB_8b5c6ed5.executeScriptMonitor(EJSRemote0SLScriptExecAsyncPvt_EJB_8b5c6ed5.java)
    at com.workpoint.server.pojo._ScriptExecAsyncPvt_Stub.executeScriptMonitor(_ScriptExecAsyncPvt_Stub.java:1)
    at com.workpoint.client.Monitor.executeScriptMonitor(Unknown Source)
    at com.workpoint.queue.work.ActionQWorker.A(Unknown Source)
    at com.workpoint.queue.work.ActionQWorker.run(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
    at java.lang.Thread.run(Thread.java:784)
Caused by:
com.aveksa.server.db.PersistenceException: com.ibm.websphere.ce.cm.StaleConnectionException: No more data to read from socket

    at com.aveksa.server.db.persistence.PersistenceServiceProvider.runStoredProcedure(PersistenceServiceProvider.java:1458)
    at com.aveksa.server.db.persistence.PersistenceServiceProvider.runStoredProcedure(PersistenceServiceProvider.java:1329)
    at com.aveksa.server.db.PersistenceManager.runStoredProcedure(PersistenceManager.java:235)
    at com.aveksa.server.workflow.scripts.WorkflowContextImpl.setCompletionInformation(WorkflowContextImpl.java:1064)
    ... 23 more
Caused by:
com.ibm.websphere.ce.cm.StaleConnectionException: No more data to read from socket

    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:56)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:39)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:527)
    at com.ibm.websphere.rsadapter.GenericDataStoreHelper.mapExceptionHelper(GenericDataStoreHelper.java:626)
    at com.ibm.websphere.rsadapter.GenericDataStoreHelper.mapException(GenericDataStoreHelper.java:685)
    at com.ibm.ws.rsadapter.AdapterUtil.mapException(AdapterUtil.java:2267)
    at com.ibm.ws.rsadapter.jdbc.WSJdbcUtil.mapException(WSJdbcUtil.java:1191)
    at com.ibm.ws.rsadapter.jdbc.WSJdbcPreparedStatement.execute(WSJdbcPreparedStatement.java:635)
    at com.aveksa.server.db.persistence.PersistenceServiceProvider.runStoredProcedure(PersistenceServiceProvider.java:1432)
    ... 26 more


The SystemOut.log file for the problematic node contains the following errors:

[15/09/16 11:33:38:144 EST] 0000013f SystemOut     O 2016-09-15 11:33:38,144 [Thread-137] INFO  com.workpoint.server.ServerProperties  - ServerProperties.setProperty() invoked for property= calculated.db.offset.millis, value=0
[15/09/16 11:33:39:775 EST] 0000013e SibMessage    I   [:] CWSIP0555W: The Remote Message Point on ME App2Node01.App2-wpBus for destination wpEventQueue, localized at 1CB1FB9AB589D497 has reached its message depth high threshold.
[15/09/16 11:33:44:793 EST] 0000013e SystemOut     O 2016-09-15 11:33:44,793 [Thread-136] ERROR com.workpoint.server.monitor.MonitorHelper  - Exception occurred attempting to queue a Monitor Started Message for monitor type actionq#ActionQ1#881
javax.jms.JMSException: CWSIA0067E: An exception was received during the call to the method JmsMsgProducerImpl.sendMessage (#4): com.ibm.ws.sib.processor.exceptions.SIMPLimitExceededException: CWSIK0025E: The destination wpEventQueue on messaging engine App2Node01.App2-wpBus is not available because the high limit for the number of messages for this destination has already been reached..

    at com.ibm.ws.sib.api.jms.impl.JmsMsgProducerImpl.sendMessage(JmsMsgProducerImpl.java:1346)
    at com.ibm.ws.sib.api.jms.impl.JmsMsgProducerImpl.send(JmsMsgProducerImpl.java:736)
    at com.workpoint.common.util.JMSUtils.sendQueueMessage(Unknown Source)
    at com.workpoint.common.util.JMSUtils.sendQueueMessage(Unknown Source)
    at com.workpoint.server.monitor.MonitorHelper.monitorStarted(Unknown Source)
    at com.workpoint.server.pojo.MonitorPvtBean.addMonitor(Unknown Source)
    at com.workpoint.server.pojo.EJSRemote0SLMonitorPvt_EJB_65486d32.addMonitor(EJSRemote0SLMonitorPvt_EJB_65486d32.java)
    at com.workpoint.server.pojo._MonitorPvt_Stub.addMonitor(_MonitorPvt_Stub.java:1)
    at com.workpoint.client.Monitor.addMonitor(Unknown Source)
    at com.workpoint.queue.core.QMonitor.H(Unknown Source)
    at com.workpoint.queue.core.QMonitor.startMonitor(Unknown Source)
    at com.workpoint.queue.WpQMonitors$_A.run(Unknown Source)
    at java.util.Timer$TimerImpl.run(Timer.java:296)
Caused by:
com.ibm.ws.sib.processor.exceptions.SIMPLimitExceededException: CWSIK0025E: The destination wpEventQueue on messaging engine App2Node01.App2-wpBus is not available because the high limit for the number of messages for this destination has already been reached.
    at com.ibm.ws.sib.processor.impl.PtoPInputHandler.checkHandlerAvailable(PtoPInputHandler.java:2969)
    at com.ibm.ws.sib.processor.impl.PtoPInputHandler.internalHandleMessage(PtoPInputHandler.java:532)
    at com.ibm.ws.sib.processor.impl.PtoPInputHandler.handleProducerMessage(PtoPInputHandler.java:283)
    at com.ibm.ws.sib.processor.impl.ProducerSessionImpl.send(ProducerSessionImpl.java:643)
    at com.ibm.ws.sib.api.jms.impl.JmsMsgProducerImpl.sendMessage(JmsMsgProducerImpl.java:1277)
... 12 more

[15/09/16 11:53:45:146 EST] 00000172 SystemOut     O 2016-09-15 11:53:45,145 [Worker_alertq#AlertQ1#WPDS_2] ERROR com.workpoint.server.recordset.SmartStatement  - SQLException caught
java.sql.SQLSyntaxErrorException: ORA-02049: timeout: distributed transaction waiting for lock

    at oracle.jdbc.driver.T4CTTIoer.processError(T4CTTIoer.java:440)
    at oracle.jdbc.driver.T4CTTIoer.processError(T4CTTIoer.java:396)
    at oracle.jdbc.driver.T4C8Oall.processError(T4C8Oall.java:837)
    at oracle.jdbc.driver.T4CTTIfun.receive(T4CTTIfun.java:445)
    at oracle.jdbc.driver.T4CTTIfun.doRPC(T4CTTIfun.java:191)
    at oracle.jdbc.driver.T4C8Oall.doOALL(T4C8Oall.java:523)
    at oracle.jdbc.driver.T4CPreparedStatement.doOall8(T4CPreparedStatement.java:207)
    at oracle.jdbc.driver.T4CPreparedStatement.executeForRows(T4CPreparedStatement.java:1010)
    at oracle.jdbc.driver.OracleStatement.doExecuteWithTimeout(OracleStatement.java:1315)
    at oracle.jdbc.driver.OraclePreparedStatement.executeInternal(OraclePreparedStatement.java:3576)
    at oracle.jdbc.driver.OraclePreparedStatement.executeUpdate(OraclePreparedStatement.java:3657)
    at oracle.jdbc.driver.OraclePreparedStatementWrapper.executeUpdate(OraclePreparedStatementWrapper.java:1350)
    at com.ibm.ws.rsadapter.jdbc.WSJdbcPreparedStatement.pmiExecuteUpdate(WSJdbcPreparedStatement.java:1187)
    at com.ibm.ws.rsadapter.jdbc.WSJdbcPreparedStatement.executeUpdate(WSJdbcPreparedStatement.java:804)
    at com.workpoint.server.recordset.SmartStatement.executeUpdate(Unknown Source)
    at com.workpoint.server.recordset.RecordSet.executeUpdate(Unknown Source)
    at com.workpoint.server.recordset.WP_PROCI_CONTROL.update(Unknown Source)
    at com.workpoint.server.job.Job.touchRootParent(Unknown Source)
    at com.workpoint.server.job.Job.touchRootParent(Unknown Source)
    at com.workpoint.server.pojo.JobUpdatePvtBean.evaluateData(Unknown Source)
    at com.workpoint.server.pojo.EJSRemote0SLJobUpdatePvt_EJB_5fb93aca.evaluateData(EJSRemote0SLJobUpdatePvt_EJB_5fb93aca.java)
    at com.workpoint.server.pojo._JobUpdatePvt_Stub.evaluateData(_JobUpdatePvt_Stub.java:1)
    at com.workpoint.client.Job.evaluate(Unknown Source)
    at com.aveksa.server.workflow.RetryEnabledOperationsJobUtils$JobEvaluateStrategy.execute(RetryEnabledOperationsJobUtils.java:345)
    at com.aveksa.server.workflow.RetryEnabledOperationsJobUtils.executeJobStrategyWithProcessing(RetryEnabledOperationsJobUtils.java:169)
    at com.aveksa.server.workflow.RetryEnabledOperationsJobUtils.evaluateWithProcessing(RetryEnabledOperationsJobUtils.java:137)
    at com.aveksa.server.workflow.scripts.nodes.EscalationHandler.createEscalationNode(EscalationHandler.java:294)
    at com.aveksa.server.workflow.scripts.nodes.EscalationHandler.escalate(EscalationHandler.java:113)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:60)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:37)
    at java.lang.reflect.Method.invoke(Method.java:611)
    at com.workpoint.server.script.StatementEngineJava.execute(Unknown Source)
    at com.workpoint.server.script.ScriptEngine.A(Unknown Source)
    at com.workpoint.server.script.ScriptEngine.execute(Unknown Source)
    at com.workpoint.server.monitor.AlertMonitorHelper.A(Unknown Source)
    at com.workpoint.server.monitor.AlertMonitorHelper.A(Unknown Source)
    at com.workpoint.server.monitor.AlertMonitorHelper.A(Unknown Source)
    at com.workpoint.server.monitor.AlertMonitorHelper.doExecute(Unknown Source)
    at com.workpoint.server.monitor.AlertMonitorHelper.execute(Unknown Source)
    at com.workpoint.server.pojo.AlertPvtBean.executeAlertMonitor(Unknown Source)
    at com.workpoint.server.pojo.EJSRemote0SLAlertPvt_EJB_0fb3523f.executeAlertMonitor(EJSRemote0SLAlertPvt_EJB_0fb3523f.java)
    at com.workpoint.server.pojo._AlertPvt_Stub.executeAlertMonitor(_AlertPvt_Stub.java:1)
    at com.workpoint.client.Monitor.executeAlertMonitor(Unknown Source)
    at com.workpoint.queue.work.AlertQWorker.A(Unknown Source)
    at com.workpoint.queue.work.AlertQWorker.run(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
    at java.lang.Thread.run(Thread.java:784)


All Nodes:

Below are occurrences of the errors in the aveksaServer.log for ALL nodes (including the problematic node):

09/15/2016 10:53:28.328 ERROR (SIBJMSRAThreadPool : 5) [org.hibernate.transaction.JDBCTransaction] Could not toggle autocommit
java.sql.SQLException: DSRA9350E: Operation setAutoCommit is not allowed during a global transaction.

    at com.ibm.ws.rsadapter.jdbc.WSJdbcConnection.setAutoCommit(WSJdbcConnection.java:3504)
    at org.hibernate.transaction.JDBCTransaction.toggleAutoCommit(JDBCTransaction.java:224)
    at org.hibernate.transaction.JDBCTransaction.rollbackAndResetAutoCommit(JDBCTransaction.java:216)
    at org.hibernate.transaction.JDBCTransaction.rollback(JDBCTransaction.java:192)
    at com.aveksa.server.db.persistence.PersistenceServiceProvider.cleanTransaction(PersistenceServiceProvider.java:2593)
    at com.aveksa.server.db.persistence.PersistenceServiceProvider.cleanTransaction(PersistenceServiceProvider.java:2569)
    at com.aveksa.server.db.PersistenceManager.cleanTransaction(PersistenceManager.java:416)
    at com.aveksa.server.workflow.scripts.split.ContextObject.getChangeRequestItemIds(ContextObject.java:1034)
    at com.aveksa.server.workflow.scripts.action.ConditionAction.evaluateChangeRequestConditions(ConditionAction.java:133)
    at com.aveksa.server.workflow.scripts.action.ConditionAction.evaluate(ConditionAction.java:76)
    at sun.reflect.GeneratedMethodAccessor184.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:37)
    at java.lang.reflect.Method.invoke(Method.java:611)
    at com.workpoint.server.script.StatementEngineJava.execute(Unknown Source)
    at com.workpoint.server.script.ScriptEngine.A(Unknown Source)
    at com.workpoint.server.script.ScriptEngine.execute(Unknown Source)
    at com.workpoint.server.script.ScriptEngine.execute(Unknown Source)
    at com.workpoint.server.job.JobNode.changeState(Unknown Source)
    at com.workpoint.server.job.JobNode.changeState(Unknown Source)
    at com.workpoint.server.job.JobNode.isComplete(Unknown Source)
    at com.workpoint.server.job.Job.changeWorkItemState(Unknown Source)
    at com.workpoint.server.pojo.ServerAutomatedActivityMDBean.autoCompleteWorkItems(Unknown Source)
    at com.workpoint.server.pojo.ServerAutomatedActivityMDBean.autoCompleteWorkItems(Unknown Source)
    at com.workpoint.server.pojo.ServerAutomatedActivityMDBean.onMessage(Unknown Source)
    at com.ibm.ejs.container.WASMessageEndpointHandler.invokeJMSMethod(WASMessageEndpointHandler.java:138)
    at com.ibm.ws.ejbcontainer.mdb.MessageEndpointHandler.invokeMdbMethod(MessageEndpointHandler.java:1146)
    at com.ibm.ws.ejbcontainer.mdb.MessageEndpointHandler.invoke(MessageEndpointHandler.java:844)
    at com.sun.proxy.$Proxy27.onMessage(Unknown Source)
    at com.ibm.ws.sib.api.jmsra.impl.JmsJcaEndpointInvokerImpl.invokeEndpoint(JmsJcaEndpointInvokerImpl.java:233)
    at com.ibm.ws.sib.ra.inbound.impl.SibRaDispatcher.dispatch(SibRaDispatcher.java:919)
    at com.ibm.ws.sib.ra.inbound.impl.SibRaSingleProcessListener$SibRaWork.run(SibRaSingleProcessListener.java:592)
    at com.ibm.ejs.j2c.work.WorkProxy.run(WorkProxy.java:668)
    at com.ibm.ws.util.ThreadPool$Worker.run(ThreadPool.java:1865)

09/15/2016 10:53:28.330 ERROR (SIBJMSRAThreadPool : 5) [org.hibernate.transaction.JDBCTransaction] JDBC rollback failed
java.sql.SQLException: DSRA9350E: Operation Connection.rollback is not allowed during a global transaction.

    at com.ibm.ws.rsadapter.jdbc.WSJdbcConnection.rollback(WSJdbcConnection.java:3350)
    at org.hibernate.transaction.JDBCTransaction.rollbackAndResetAutoCommit(JDBCTransaction.java:213)
    at org.hibernate.transaction.JDBCTransaction.rollback(JDBCTransaction.java:192)
    at com.aveksa.server.db.persistence.PersistenceServiceProvider.cleanTransaction(PersistenceServiceProvider.java:2593)
    at com.aveksa.server.db.persistence.PersistenceServiceProvider.cleanTransaction(PersistenceServiceProvider.java:2569)
    at com.aveksa.server.db.PersistenceManager.cleanTransaction(PersistenceManager.java:416)
    at com.aveksa.server.workflow.scripts.split.ContextObject.getChangeRequestItemIds(ContextObject.java:1034)
    at com.aveksa.server.workflow.scripts.action.ConditionAction.evaluateChangeRequestConditions(ConditionAction.java:133)
    at com.aveksa.server.workflow.scripts.action.ConditionAction.evaluate(ConditionAction.java:76)
    at sun.reflect.GeneratedMethodAccessor184.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:37)
    at java.lang.reflect.Method.invoke(Method.java:611)
    at com.workpoint.server.script.StatementEngineJava.execute(Unknown Source)
    at com.workpoint.server.script.ScriptEngine.A(Unknown Source)
    at com.workpoint.server.script.ScriptEngine.execute(Unknown Source)
    at com.workpoint.server.script.ScriptEngine.execute(Unknown Source)
    at com.workpoint.server.job.JobNode.changeState(Unknown Source)
    at com.workpoint.server.job.JobNode.changeState(Unknown Source)
    at com.workpoint.server.job.JobNode.isComplete(Unknown Source)
    at com.workpoint.server.job.Job.changeWorkItemState(Unknown Source)
    at com.workpoint.server.pojo.ServerAutomatedActivityMDBean.autoCompleteWorkItems(Unknown Source)
    at com.workpoint.server.pojo.ServerAutomatedActivityMDBean.autoCompleteWorkItems(Unknown Source)
    at com.workpoint.server.pojo.ServerAutomatedActivityMDBean.onMessage(Unknown Source)
    at com.ibm.ejs.container.WASMessageEndpointHandler.invokeJMSMethod(WASMessageEndpointHandler.java:138)
    at com.ibm.ws.ejbcontainer.mdb.MessageEndpointHandler.invokeMdbMethod(MessageEndpointHandler.java:1146)
    at com.ibm.ws.ejbcontainer.mdb.MessageEndpointHandler.invoke(MessageEndpointHandler.java:844)
    at com.sun.proxy.$Proxy27.onMessage(Unknown Source)
    at com.ibm.ws.sib.api.jmsra.impl.JmsJcaEndpointInvokerImpl.invokeEndpoint(JmsJcaEndpointInvokerImpl.java:233)
    at com.ibm.ws.sib.ra.inbound.impl.SibRaDispatcher.dispatch(SibRaDispatcher.java:919)
    at com.ibm.ws.sib.ra.inbound.impl.SibRaSingleProcessListener$SibRaWork.run(SibRaSingleProcessListener.java:592)
    at com.ibm.ejs.j2c.work.WorkProxy.run(WorkProxy.java:668)
    at com.ibm.ws.util.ThreadPool


Cause
The JMS queue of the problematic node was corrupted due to a bad file store.

For example, given a three-node cluster (App1, App2 and App3) where App2 is the node with the corrupt JMS queue, the following happens:

  • The moment the problematic node (App2) is started, it tries to replay all the transactions it has stored (potentially thousands), most of which have probably already been picked up by another application server long ago.
  • Because of this, the database locks thousands of records (one for each transaction being replayed). The other application servers cannot access any of these rows due to the row lock contention.
  • Because of the corruption, the SQL queries performed by App2 never complete or clear out, which causes the error below to appear in the SystemOut.log of the App2 server, as well as the other errors seen in both the aveksaServer.log and SystemOut.log files.

ORA-02049: timeout: distributed transaction waiting for lock
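Because each stored transaction lives in the node's SIBus file store, a file store that has grown much larger on one node than on its peers can help corroborate this cause. A hedged, illustrative sketch (not from the original article; the default path is the parent of the example file store location used in the Resolution below, so adjust it per node):

```shell
#!/bin/sh
# Report the on-disk size of each SIBus file store under a WebSphere profile.
# A store far larger than its peers may hold a backlog of stored transactions.
# FILESTORE_ROOT is an example location; override it for your environment.
FILESTORE_ROOT="${FILESTORE_ROOT:-/opt/IBM/WebSphere/AppServer/profiles/AppSrv01/filestores/com.ibm.ws.sib}"

# Size in KB per file store directory, largest first.
du -sk "$FILESTORE_ROOT"/*/ 2>/dev/null | sort -rn
```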
Resolution
Follow the steps below. This example assumes a three-node cluster (App1, App2, App3) where App2 is the node with the corrupt JMS queue.
  1. Bring down all the application servers in the WebSphere clustered environment and ensure that there are no connections to the database.
  2. Rename the file store destination folder on the problematic node (App2), then restart App2. For example, rename /opt/IBM/WebSphere/AppServer/profiles/AppSrv01/filestores/com.ibm.ws.sib/App2-wpBus-8B7CA4D08CCCB229 to App2-wpBus-8B7CA4D08CCCB229.orig<date>.
  3. After renaming the file store destination folder and restarting App2's application server, its queue file store destination folder will be recreated automatically.
  4. Ensure that activities are going to App2 and monitor this node; i.e., verify that there are no database locking issues.
  5. Once verified, bring down the App2 application server.
  6. Bring up the primary application server (App1).
  7. Ensure the majority of the requests/activities are going to App1 and monitor this node; i.e., verify that there are no database locking issues.
  8. Once verified, restart the primary application server (App1).
  9. Start up the second application server (App2).
  10. Start up the third application server (App3).
  11. Allow more requests to be processed and ensure that each of the servers gets hit.
  12. Monitor all servers and observe that there are no database locking issues. There should be no occurrences of the specific errors in the logs that were symptoms of this issue.
NOTE: There is no known fix to prevent JMS queue corruption if a file store goes bad.
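Resolution step 2 can be sketched as a short shell script. This is a minimal, illustrative version assuming the example file store path from step 2; run it only with all application servers down, per step 1:

```shell
#!/bin/sh
# Move the corrupt file store aside so that WebSphere recreates it when the
# node restarts (Resolution step 2). FILESTORE is the example path from this
# article; substitute your own profile and bus member name.
FILESTORE="${FILESTORE:-/opt/IBM/WebSphere/AppServer/profiles/AppSrv01/filestores/com.ibm.ws.sib/App2-wpBus-8B7CA4D08CCCB229}"

if [ -d "$FILESTORE" ]; then
    # Append .orig<date> so the old store is preserved for later inspection.
    mv "$FILESTORE" "$FILESTORE.orig$(date +%Y%m%d)"
    echo "Renamed to $FILESTORE.orig$(date +%Y%m%d); restart the node to recreate the store."
fi
```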
Workaround
Shut down the node that is having the issue.
 
