
Key: HHQ-1070
Type: Bug
Status: Closed
Resolution: Fixed
Priority: Critical
Assignee: Ryan Morgan
Reporter: Ryan Morgan
Votes: 0
Watchers: 0
Hyperic HQ

Agent connection hang

Created: 09/Sep/07 12:41 PM   Updated: 23/Oct/07 10:14 PM
Component/s: Agent
Affects Version/s: 3.1.0, 3.0.5, 3.1.1
Fix Version/s: 3.2.0, 3.1.4

Verify By: Kashyap Parikh
Last comment: 56 weeks, 2 days ago
Resolution Date: 12/Oct/07 07:44 AM


Description
If the agent is not responding correctly to requests, it's possible for agent connections from the server to hang forever. This is common when doing mass operations where many agents need to be contacted, such as changing metric templates globally.

Need to investigate if setting a connection timeout would solve this problem.

The full thread dump:

"http-0.0.0.0-7080-177" daemon prio=10 tid=0x00002aab782abc00 nid=0x4274 runnable [0x000000005afcc000..0x000000005afd6c20]
   java.lang.Thread.State: RUNNABLE
        at java.net.SocketInputStream.socketRead0(Native Method)
        at java.net.SocketInputStream.read(SocketInputStream.java:129)
        at com.sun.net.ssl.internal.ssl.InputRecord.readFully(InputRecord.java:293)
        at com.sun.net.ssl.internal.ssl.InputRecord.read(InputRecord.java:331)
        at com.sun.net.ssl.internal.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:789)
        - locked <0x00002aaaefef7a80> (a java.lang.Object)
        at com.sun.net.ssl.internal.ssl.SSLSocketImpl.performInitialHandshake(SSLSocketImpl.java:1096)
        - locked <0x00002aaaefef7a28> (a java.lang.Object)
        at com.sun.net.ssl.internal.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1123)
        at com.sun.net.ssl.internal.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1107)
        at org.hyperic.hq.bizapp.agent.client.SecureAgentConnection.getSSLSocket(SecureAgentConnection.java:95)
        at org.hyperic.hq.bizapp.agent.client.SecureAgentConnection.getSocket(SecureAgentConnection.java:132)
        at org.hyperic.hq.agent.client.AgentConnection.sendCommandHeaders(AgentConnection.java:153)
        at org.hyperic.hq.agent.client.AgentConnection.sendCommand(AgentConnection.java:121)
        at org.hyperic.hq.measurement.agent.client.MeasurementCommandsClient.getMeasurements(MeasurementCommandsClient.java:135)
        at org.hyperic.hq.measurement.agent.client.AgentMonitor.getLiveValues(AgentMonitor.java:218)

Comments
Nipuna Bhayani - 10/Sep/07 09:54 AM
Upgraded to blocker

Ryan Morgan - 11/Sep/07 09:06 AM
Downgrade to critical, move into 3.2.

Setting the connection timeout can be risky for operations that may take some time to complete. Setting a timeout of a minute or two should be acceptable, but needs more review and should not be put into 3.1 with only a week to release.
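One way to limit the risk described above is to apply a short read timeout only around the step that is known to hang, then restore the previous value so that long-running operations are not cut off. The sketch below illustrates that idea with plain JDK sockets. The class `ScopedReadTimeout`, the interface `SocketAction`, and the method `withReadTimeout` are hypothetical names for illustration and are not part of the HQ code base.

```java
import java.io.IOException;
import java.net.Socket;

public class ScopedReadTimeout {

    /** A socket operation that may throw IOException (hypothetical helper type). */
    @FunctionalInterface
    public interface SocketAction<T> {
        T run() throws IOException;
    }

    /**
     * Runs the action with SO_TIMEOUT set to timeoutMs, then restores the
     * socket's previous read timeout. A blocking read inside the action that
     * exceeds timeoutMs fails with java.net.SocketTimeoutException.
     */
    public static <T> T withReadTimeout(Socket socket, int timeoutMs, SocketAction<T> action)
            throws IOException {
        int previous = socket.getSoTimeout(); // 0 means "block forever"
        socket.setSoTimeout(timeoutMs);
        try {
            return action.run();
        } finally {
            // Restore only if the socket is still open; setSoTimeout throws on a closed socket.
            if (!socket.isClosed()) {
                socket.setSoTimeout(previous);
            }
        }
    }
}
```

With a helper like this, a caller could bound a risky step (for example `withReadTimeout(socket, 60_000, () -> in.read())`) while leaving the timeout for the rest of the exchange unchanged.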

Ryan Morgan - 12/Oct/07 07:44 AM

This is fixed in 3.2. The server now sets a read timeout when doing the initial SSL handshake.
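For illustration, a minimal sketch of what setting a read timeout around the initial SSL handshake can look like with the standard JDK socket API. This is not the actual HQ change: the class `HandshakeTimeoutSketch`, the method `connect`, and the two timeout parameters are hypothetical. It relies on `SO_TIMEOUT` bounding each blocking read, including the reads performed inside `startHandshake()`, which is where the thread in the dump above was stuck.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;
import javax.net.ssl.SSLSocket;
import javax.net.ssl.SSLSocketFactory;

public class HandshakeTimeoutSketch {

    /**
     * Opens a TLS connection, bounding both the TCP connect and the reads
     * performed during the handshake. Timeouts are in milliseconds.
     */
    public static SSLSocket connect(SSLSocketFactory factory, String host, int port,
                                    int connectTimeoutMs, int handshakeTimeoutMs)
            throws IOException {
        Socket raw = new Socket();
        try {
            // Bound the TCP connect itself.
            raw.connect(new InetSocketAddress(host, port), connectTimeoutMs);

            // Layer TLS over the connected socket; autoClose=true closes raw with ssl.
            SSLSocket ssl = (SSLSocket) factory.createSocket(raw, host, port, true);

            // Without this, a peer that accepts the connection but never answers
            // the handshake leaves the caller blocked in a socket read indefinitely.
            ssl.setSoTimeout(handshakeTimeoutMs);
            ssl.startHandshake();
            return ssl;
        } catch (IOException e) {
            raw.close();
            throw e;
        }
    }
}
```

A caller that needs a longer limit for the command itself can call `setSoTimeout` again on the returned socket once the handshake has completed.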

Kashyap Parikh - 23/Oct/07 10:14 PM
Verified by updating the metric template for 3 platforms; it worked fine.