History | Log In     View a printable version of the current page.  
HQ 4.0 EE Release is Now Available | HQ 3.2.5-EE Maintenance Release is Now Available
Issue Details (XML | Word | Printable)

Key: HHQ-1115
Type: Bug Bug
Status: Closed Closed
Resolution: Fixed
Priority: Critical Critical
Assignee: Tom Keeney
Reporter: Ryan Morgan
Votes: 0
Watchers: 0
Operations

If you were logged in you would be able to see more operations.
Hyperic HQ

Exception trying to reschedule metrics due to out of date SRN

Created: 11/Oct/07 01:11 PM   Updated: 01/Nov/07 01:26 PM
Component/s: Server
Affects Version/s: 3.1.1
Fix Version/s: 3.2.0, 3.1.4

File Attachments: 1. Text File HHQ_1115patch.txt (0.9 kb)
2. Text File HHQ_1115patch2.txt (15 kb)
3. Zip Archive server.zip (541 kb)


Verify By: Kashyap Parikh
HQ Fix Version: 3.1.3
Last comment: 55 weeks ago
Resolution Date: 01/Nov/07 01:26 PM


 Description  « Hide
It's possible in some situations for the server to be unable to reschedule metrics due to the transaction being marked for rollback. The secenario is as follows:

1. The agent gets out of sync with the metric schedule stored on the server. This could happen if the agent was down when the schedule was updated.
2. The agent reports in a batch of metrics that meets two criteria:
 - The batch causes a duplicate key violation in the metric tables causing us to insert 1 by 1
 - The batch is sent with SRNs that are out of date, forcing a reschedule

At least thats my initial thinking based on the stracktrace below:

2007-10-11 14:10:13,084 WARN [org.hibernate.util.JDBCExceptionReporter] SQL Error: 0, SQLState: null
2007-10-11 14:10:13,085 ERROR [org.hibernate.util.JDBCExceptionReporter] Transaction is not active: tx=TransactionImpl:XidImpl[FormatId=257, GlobalId=hqapp-0001.hi5.com/17278492, BranchQual=, localId=17278492]; - nested throwable: (javax.resource.ResourceException: Transaction is not active: tx=TransactionImpl:XidImpl[FormatId=257, GlobalId=hqapp-0001.hi5.com/17278492, BranchQual=, localId=17278492])2007-10-11 14:10:13,085 ERROR [org.hyperic.hq.measurement.server.mbean.ScheduleVerificationService] Failed to refresh schedule for entity: 1:14653
org.hibernate.exception.GenericJDBCException: Cannot open connection
        at org.hibernate.exception.SQLStateConverter.handledNonSpecificException(SQLStateConverter.java:103)
        at org.hibernate.exception.SQLStateConverter.convert(SQLStateConverter.java:91)
        at org.hibernate.exception.JDBCExceptionHelper.convert(JDBCExceptionHelper.java:43)
        at org.hibernate.exception.JDBCExceptionHelper.convert(JDBCExceptionHelper.java:29)
        at org.hibernate.jdbc.ConnectionManager.openConnection(ConnectionManager.java:426)
        at org.hibernate.jdbc.ConnectionManager.getConnection(ConnectionManager.java:144)
        at org.hibernate.jdbc.AbstractBatcher.prepareQueryStatement(AbstractBatcher.java:139)
        at org.hibernate.loader.Loader.prepareQueryStatement(Loader.java:1547)
        at org.hibernate.loader.Loader.doQuery(Loader.java:673)
        at org.hibernate.loader.Loader.doQueryAndInitializeNonLazyCollections(Loader.java:236)
        at org.hibernate.loader.Loader.loadCollection(Loader.java:1994)
        at org.hibernate.loader.collection.BatchingCollectionInitializer.initialize(BatchingCollectionInitializer.java:52)
        at org.hibernate.persister.collection.AbstractCollectionPersister.initialize(AbstractCollectionPersister.java:565)
        at org.hibernate.event.def.DefaultInitializeCollectionEventListener.onInitializeCollection(DefaultInitializeCollectionEventListener.java:60)
        at org.hibernate.impl.SessionImpl.initializeCollection(SessionImpl.java:1716)
        at org.hibernate.collection.AbstractPersistentCollection.initialize(AbstractPersistentCollection.java:344)
        at org.hibernate.collection.AbstractPersistentCollection.read(AbstractPersistentCollection.java:86)
        at org.hibernate.collection.PersistentBag.iterator(PersistentBag.java:249)
        at java.util.Collections$UnmodifiableCollection$1.<init>(Unknown Source)
        at java.util.Collections$UnmodifiableCollection.iterator(Unknown Source)
        at org.hyperic.hq.measurement.ext.depgraph.GraphBuilder.buildGraph(GraphBuilder.java:74)
        at org.hyperic.hq.measurement.ext.depgraph.GraphBuilder.buildGraph(GraphBuilder.java:59)
        at org.hyperic.hq.measurement.server.mdb.AgentScheduleSynchronizer.reschedule(AgentScheduleSynchronizer.java:166)
        at org.hyperic.hq.measurement.server.mdb.AgentScheduleSynchronizer.reschedule(AgentScheduleSynchronizer.java:264)
        at org.hyperic.hq.measurement.server.mbean.ScheduleVerificationService.refreshSchedule(ScheduleVerificationService.java:100)
        at sun.reflect.GeneratedMethodAccessor755.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
        at java.lang.reflect.Method.invoke(Unknown Source)
        at org.jboss.mx.interceptor.ReflectedDispatcher.invoke(ReflectedDispatcher.java:141)
        at org.jboss.mx.server.Invocation.dispatch(Invocation.java:80)
        at org.jboss.mx.server.Invocation.invoke(Invocation.java:72)
        at org.jboss.mx.server.AbstractMBeanInvoker.invoke(AbstractMBeanInvoker.java:245)
        at org.jboss.mx.server.MBeanServerImpl.invoke(MBeanServerImpl.java:644)
        at org.hyperic.hq.measurement.server.session.DerivedMeasurementManagerEJBImpl.reschedule(DerivedMeasurementManagerEJBImpl.java:1372)
        at sun.reflect.GeneratedMethodAccessor754.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
        at java.lang.reflect.Method.invoke(Unknown Source)
        at org.jboss.invocation.Invocation.performCall(Invocation.java:345)
        at org.jboss.ejb.StatelessSessionContainer$ContainerInterceptor.invoke(StatelessSessionContainer.java:214)
        at org.jboss.resource.connectionmanager.CachedConnectionInterceptor.invoke(CachedConnectionInterceptor.java:149)
        at org.jboss.ejb.plugins.StatelessSessionInstanceInterceptor.invoke(StatelessSessionInstanceInterceptor.java:154)
        at org.jboss.webservice.server.ServiceEndpointInterceptor.invoke(ServiceEndpointInterceptor.java:54)
        at org.jboss.ejb.plugins.CallValidationInterceptor.invoke(CallValidationInterceptor.java:48)
        at org.jboss.ejb.plugins.AbstractTxInterceptor.invokeNext(AbstractTxInterceptor.java:106)
        at org.jboss.ejb.plugins.TxInterceptorCMT.runWithTransactions(TxInterceptorCMT.java:335)
        at org.jboss.ejb.plugins.TxInterceptorCMT.invoke(TxInterceptorCMT.java:166)
        at org.hyperic.hq.application.HQApp$Snatcher.invokeNextBoth(HQApp.java:245)
        at org.hyperic.hq.application.HQApp$Snatcher.invokeNext(HQApp.java:281)
        at org.hyperic.txsnatch.TxSnatch.invoke(TxSnatch.java:71)
        at org.jboss.ejb.plugins.SecurityInterceptor.invoke(SecurityInterceptor.java:153)
        at org.jboss.ejb.plugins.LogInterceptor.invoke(LogInterceptor.java:192)
        at org.jboss.ejb.plugins.ProxyFactoryFinderInterceptor.invoke(ProxyFactoryFinderInterceptor.java:122)
        at org.jboss.ejb.SessionContainer.internalInvoke(SessionContainer.java:624)
        at org.jboss.ejb.Container.invoke(Container.java:873)
        at org.jboss.ejb.plugins.local.BaseLocalProxyFactory.invoke(BaseLocalProxyFactory.java:415)
        at org.jboss.ejb.plugins.local.StatelessSessionProxy.invoke(StatelessSessionProxy.java:88)
        at $Proxy306.reschedule(Unknown Source)
        at org.hyperic.hq.measurement.server.session.SRNManagerEJBImpl.reportAgentSRNs(SRNManagerEJBImpl.java:231)
        at sun.reflect.GeneratedMethodAccessor695.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
        at java.lang.reflect.Method.invoke(Unknown Source)
        at org.jboss.invocation.Invocation.performCall(Invocation.java:345)
        at org.jboss.ejb.StatelessSessionContainer$ContainerInterceptor.invoke(StatelessSessionContainer.java:214)
        at org.jboss.resource.connectionmanager.CachedConnectionInterceptor.invoke(CachedConnectionInterceptor.java:149)
        at org.jboss.ejb.plugins.StatelessSessionInstanceInterceptor.invoke(StatelessSessionInstanceInterceptor.java:154)
        at org.jboss.webservice.server.ServiceEndpointInterceptor.invoke(ServiceEndpointInterceptor.java:54)
        at org.jboss.ejb.plugins.CallValidationInterceptor.invoke(CallValidationInterceptor.java:48)
        at org.jboss.ejb.plugins.AbstractTxInterceptor.invokeNext(AbstractTxInterceptor.java:106)
        at org.jboss.ejb.plugins.TxInterceptorCMT.runWithTransactions(TxInterceptorCMT.java:335)
        at org.jboss.ejb.plugins.TxInterceptorCMT.invoke(TxInterceptorCMT.java:166)
        at org.hyperic.hq.application.HQApp$Snatcher.invokeNextBoth(HQApp.java:245)
        at org.hyperic.hq.application.HQApp$Snatcher.invokeNext(HQApp.java:281)
        at org.hyperic.txsnatch.TxSnatch.invoke(TxSnatch.java:71)
        at org.jboss.ejb.plugins.SecurityInterceptor.invoke(SecurityInterceptor.java:153)
        at org.jboss.ejb.plugins.LogInterceptor.invoke(LogInterceptor.java:192)
        at org.jboss.ejb.plugins.ProxyFactoryFinderInterceptor.invoke(ProxyFactoryFinderInterceptor.java:122)
        at org.jboss.ejb.SessionContainer.internalInvoke(SessionContainer.java:624)
        at org.jboss.ejb.Container.invoke(Container.java:873)
        at org.jboss.ejb.plugins.local.BaseLocalProxyFactory.invoke(BaseLocalProxyFactory.java:415)
        at org.jboss.ejb.plugins.local.StatelessSessionProxy.invoke(StatelessSessionProxy.java:88)
        at $Proxy311.reportAgentSRNs(Unknown Source)
        at org.hyperic.hq.measurement.server.session.ReportProcessorEJBImpl.handleMeasurementReport(ReportProcessorEJBImpl.java:179)
        at sun.reflect.GeneratedMethodAccessor674.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
        at java.lang.reflect.Method.invoke(Unknown Source)
        at org.jboss.invocation.Invocation.performCall(Invocation.java:345)
        at org.jboss.ejb.StatelessSessionContainer$ContainerInterceptor.invoke(StatelessSessionContainer.java:214)
        at org.jboss.resource.connectionmanager.CachedConnectionInterceptor.invoke(CachedConnectionInterceptor.java:149)
        at org.jboss.ejb.plugins.StatelessSessionInstanceInterceptor.invoke(StatelessSessionInstanceInterceptor.java:154)
        at org.jboss.webservice.server.ServiceEndpointInterceptor.invoke(ServiceEndpointInterceptor.java:54)
        at org.jboss.ejb.plugins.CallValidationInterceptor.invoke(CallValidationInterceptor.java:48)
        at org.jboss.ejb.plugins.AbstractTxInterceptor.invokeNext(AbstractTxInterceptor.java:106)
        at org.jboss.ejb.plugins.TxInterceptorCMT.runWithTransactions(TxInterceptorCMT.java:335)
        at org.jboss.ejb.plugins.TxInterceptorCMT.invoke(TxInterceptorCMT.java:166)
        at org.hyperic.hq.application.HQApp$Snatcher.invokeNextBoth(HQApp.java:245)
        at org.hyperic.hq.application.HQApp$Snatcher.invokeNext(HQApp.java:281)
        at org.hyperic.txsnatch.TxSnatch.invoke(TxSnatch.java:71)
        at org.jboss.ejb.plugins.SecurityInterceptor.invoke(SecurityInterceptor.java:153)
        at org.jboss.ejb.plugins.LogInterceptor.invoke(LogInterceptor.java:192)
        at org.jboss.ejb.plugins.ProxyFactoryFinderInterceptor.invoke(ProxyFactoryFinderInterceptor.java:122)
        at org.jboss.ejb.SessionContainer.internalInvoke(SessionContainer.java:624)
        at org.jboss.ejb.Container.invoke(Container.java:873)
        at org.jboss.ejb.plugins.local.BaseLocalProxyFactory.invoke(BaseLocalProxyFactory.java:415)
        at org.jboss.ejb.plugins.local.StatelessSessionProxy.invoke(StatelessSessionProxy.java:88)
        at $Proxy310.handleMeasurementReport(Unknown Source)
        at org.hyperic.hq.bizapp.server.session.LatherDispatcher.cmdMeasurementSendReport(LatherDispatcher.java:493)
        at org.hyperic.hq.bizapp.server.session.LatherDispatcher.dispatch(LatherDispatcher.java:724)
        at com.hyperic.hq.bizapp.server.session.LatherDispatcher.dispatch(LatherDispatcher.java:128)
        at com.hyperic.hq.bizapp.server.session.EELatherBossEJBImpl.dispatch(EELatherBossEJBImpl.java:58)
        at sun.reflect.GeneratedMethodAccessor669.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
        at java.lang.reflect.Method.invoke(Unknown Source)
        at org.jboss.invocation.Invocation.performCall(Invocation.java:345)
        at org.jboss.ejb.StatelessSessionContainer$ContainerInterceptor.invoke(StatelessSessionContainer.java:214)
        at org.jboss.resource.connectionmanager.CachedConnectionInterceptor.invoke(CachedConnectionInterceptor.java:149)
        at org.jboss.ejb.plugins.StatelessSessionInstanceInterceptor.invoke(StatelessSessionInstanceInterceptor.java:154)
        at org.jboss.webservice.server.ServiceEndpointInterceptor.invoke(ServiceEndpointInterceptor.java:54)
        at org.jboss.ejb.plugins.CallValidationInterceptor.invoke(CallValidationInterceptor.java:48)
        at org.jboss.ejb.plugins.AbstractTxInterceptor.invokeNext(AbstractTxInterceptor.java:106)
        at org.jboss.ejb.plugins.TxInterceptorCMT.runWithTransactions(TxInterceptorCMT.java:335)
        at org.jboss.ejb.plugins.TxInterceptorCMT.invoke(TxInterceptorCMT.java:166)
        at org.hyperic.hq.application.HQApp$Snatcher.invokeNextBoth(HQApp.java:245)
        at org.hyperic.hq.application.HQApp$Snatcher.invokeNext(HQApp.java:281)
        at org.hyperic.txsnatch.TxSnatch.invoke(TxSnatch.java:71)
        at org.jboss.ejb.plugins.SecurityInterceptor.invoke(SecurityInterceptor.java:153)
        at org.jboss.ejb.plugins.LogInterceptor.invoke(LogInterceptor.java:192)
        at org.jboss.ejb.plugins.ProxyFactoryFinderInterceptor.invoke(ProxyFactoryFinderInterceptor.java:122)
        at org.jboss.ejb.SessionContainer.internalInvoke(SessionContainer.java:624)
        at org.jboss.ejb.Container.invoke(Container.java:873)
        at org.jboss.ejb.plugins.local.BaseLocalProxyFactory.invoke(BaseLocalProxyFactory.java:415)
        at org.jboss.ejb.plugins.local.StatelessSessionProxy.invoke(StatelessSessionProxy.java:88)
        at $Proxy319.dispatch(Unknown Source)
        at sun.reflect.GeneratedMethodAccessor668.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
        at java.lang.reflect.Method.invoke(Unknown Source)
        at org.hyperic.lather.jboss.JBossLatherServlet$EJBCaller.doInvoke(JBossLatherServlet.java:318)
        at org.hyperic.lather.jboss.JBossLatherServlet$EJBCaller.run(JBossLatherServlet.java:354)
Caused by: org.jboss.util.NestedSQLException: Transaction is not active: tx=TransactionImpl:XidImpl[FormatId=257, GlobalId=hqapp-0001.hi5.com/17278492, BranchQual=, localId=17278492]; - nested throwable: (javax.resource.ResourceException: Transaction is not active: tx=TransactionImpl:XidImpl[FormatId=257, GlobalId=hqapp-0001.hi5.com/17278492, BranchQual=, localId=17278492])
        at org.jboss.resource.adapter.jdbc.WrapperDataSource.getConnection(WrapperDataSource.java:79)
        at org.hibernate.connection.DatasourceConnectionProvider.getConnection(DatasourceConnectionProvider.java:69)
        at org.hibernate.jdbc.ConnectionManager.openConnection(ConnectionManager.java:423)
        ... 132 more
Caused by: javax.resource.ResourceException: Transaction is not active: tx=TransactionImpl:XidImpl[FormatId=257, GlobalId=hqapp-0001.hi5.com/17278492, BranchQual=, localId=17278492]
        at org.jboss.resource.connectionmanager.TxConnectionManager.getManagedConnection(TxConnectionManager.java:331)
        at org.jboss.resource.connectionmanager.BaseConnectionManager2.allocateConnection(BaseConnectionManager2.java:462)
        at org.jboss.resource.connectionmanager.BaseConnectionManager2$ConnectionManagerProxy.allocateConnection(BaseConnectionManager2.java:894)
        at org.jboss.resource.adapter.jdbc.WrapperDataSource.getConnection(WrapperDataSource.java:73)
        ... 134 more


 All   Comments   Change History      Sort Order: Ascending order - Click to sort in descending order
Ryan Morgan - 12/Oct/07 02:10 PM
Assign to Charles.

Charles Lee - 22/Oct/07 08:01 AM
GraphBuilder is trying to access an unmodifiable collection that needed to be initialized. Move GraphBuilder into same package as Hibernate POJO so that we can use the modifiable collection.

Charles Lee - 22/Oct/07 08:39 AM
Tom is investigating problems related to the SRN

Tom Keeney - 22/Oct/07 05:17 PM
Backported changes from the trunk (3.2) to explicitly manage the txn. The measurement schedule synch occurring when handling the measurement report was failing when setRollbackOnly was set on the container managed txn. Now that the txns are managed explicitly, we rollback the txn and the measurement synch occurs in a separate container managed txn.

See the attached patch files (HHQ_1115patch.txt and HHQ_1115patch2.txt)

Tom Keeney - 22/Oct/07 05:24 PM
While testing the data manager txn fix, we have noticed other related exceptions on mongo. It appears that a TransactionRolledBackException is occurring when the server is trying to update the schedule on an agent where the connection is timing out. We need to modify the schedule update code to cache the down agents so that srn reports are not resent to them.

Tom Keeney - 23/Oct/07 03:32 PM
Caching the down agents is not necessary since we are only updating the SRNs for a single agent and that agent initiated the request when sending a measurement report. It is highly unlikely that reporting the SRNs to the agent would cause a txn timeout (even with a stuck back channel to the agent). The agent connection code includes a 10 sec timeout so that a stuck back channel can be recovered from. We will not be addressing this issue further since it is not reproducible again now that we have a fresh instance of 3.1 on mongo.

Kashyap Parikh - 01/Nov/07 09:13 AM
After running following test I am seeing lots of "Failed to refresh schedule for entity" errors in the log when I first tried to update the metric collection schedule. This was while all agents were up.

HQ Server:
CPU 2@2.4GHz
RAM 2GB
JVM mx, 512MB
JVM Total 350MB
HQ version: 3.1.3 #499

HQ Agent:
CPU Duo@2GHZ
RAM 2GB
20 agents, each monitoring 400 file services
Free Memory after all agents loaded: 250MB
HQ version: 3.1.3 #499

After loading the server with 20 agents I should see atleast 8000 services but I only see 1200 so at some point this failed (could be because of HQ-949).
See attached server log.

Kashyap Parikh - 01/Nov/07 01:26 PM
Closing based on following comments from Tom. The testing done to verify this bug is sufficient.

Tom: Yes, this is definitely an issue we should fix but it's not related to the bug originally reported for HQ-1115 where calling setRollbackOnly() in the data manager invalidated the txn so the container refused to provide the app with any more connections. We fixed HQ-1115 in 3.1.3.

Filed HHQ-1196 for the issue reported as a result of above mini performance test.