
Hyperic HQ

Implement Backfiller Startup "smart" logic

Details

  • Type: Improvement
  • Status: Reopened
  • Priority: Major
  • Resolution: Unresolved
  • Affects Version/s: 4.5
  • Fix Version/s: None
  • Component/s: Alerts
  • Case Links:
    none
  • Regression:
    No
  • Story Points:
    3

Description

Need to find a way to deterministically know when it is acceptable to start the backfiller.

  1. hqstats-09-28.ods
    28/Sep/10 3:51 PM
    235 kB
    Dharma Srinivasan
  2. server.log.2010-09-28.gz
    28/Sep/10 3:51 PM
    539 kB
    Dharma Srinivasan

Activity

Patrick Nguyen added a comment -

FIX: Start the backfiller after the initial wait time only if the availability inserter is not "backlogged" (queue size < 1000)
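
The start condition described in this fix can be sketched roughly as below. This is a minimal illustration, not the actual Hyperic code: the `BackfillerGate` class, its method name, and the 300-second wait in the usage note are hypothetical. Only the "queue size < 1000" threshold comes from the comment above.

```python
from dataclasses import dataclass

# Threshold taken from the comment: a queue size below 1000 means "not backlogged".
BACKLOG_THRESHOLD = 1000


@dataclass
class BackfillerGate:
    """Hypothetical helper that decides whether the backfiller may start."""

    initial_wait_secs: float  # assumed to be configurable; value not given in the issue
    backlog_threshold: int = BACKLOG_THRESHOLD

    def may_start(self, secs_since_startup: float, queue_size: int) -> bool:
        # Never start before the initial wait time has elapsed.
        if secs_since_startup < self.initial_wait_secs:
            return False
        # After the wait, start only if the availability inserter is not backlogged.
        return queue_size < self.backlog_threshold
```

For example, with a hypothetical `BackfillerGate(initial_wait_secs=300)`, `may_start(400, 999)` is true while `may_start(400, 1000)` and `may_start(100, 0)` are both false. The check looks at a single queue-size reading at the moment it runs.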

Dharma Srinivasan added a comment -

Tested with build #112 (Linux, MySQL, 1327 platforms) and found that the backfiller starts before the queue is empty, so alerts are fired for all the agents for the time the server was down.

Steps/explanation:

1. Stop hq-server #105 ~10:30 am
2. Start it after 1 hour
3. Stop it within the next ~5 mins
4. Upgrade server to #112
5. Start upgraded server (now 2 hours since it was first brought down) ~12:30 pm

The server stats and logs show that the backfiller starts at 1:03 and alerts are immediately fired for hundreds of agents.
The availability queue size does decrease around this time, but it keeps spiking and falling for a few more minutes before it stabilizes at a lower value.

As discussed with Patrick and Scott, the availability queue size falling below 1000 does not seem to be a good indication that the backfiller can start.
Re-opening so that the logic for the backfiller start time can be changed.

The attached hq-stats file has charts (on a separate sheet) for correlation, and the time/row at which the backfiller starts is marked in red.
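
To illustrate the behavior reported above, here is a small sketch comparing a single-sample threshold check with a stricter check that requires the queue to stay below the threshold for several consecutive samples. The queue sizes are made-up values that mimic a spiking queue, not data from the attached stats. The function names and the `required` parameter are hypothetical, and the stricter check is only one possible direction, not a decided fix.

```python
def first_start_single_sample(samples, threshold=1000):
    """Index of the first sample below the threshold (the reported behavior)."""
    for i, size in enumerate(samples):
        if size < threshold:
            return i
    return None


def first_start_sustained(samples, threshold=1000, required=3):
    """Index at which the queue has stayed below the threshold for
    `required` consecutive samples, or None if that never happens."""
    run = 0
    for i, size in enumerate(samples):
        # Any sample at or above the threshold resets the streak.
        run = run + 1 if size < threshold else 0
        if run >= required:
            return i
    return None


# Hypothetical queue sizes: the queue dips below 1000, then spikes again.
samples = [5200, 3100, 900, 2400, 800, 1900, 700, 400, 300, 250]

early = first_start_single_sample(samples)
later = first_start_sustained(samples)
```

With these sample values, the single-sample check allows a start at index 2, while the queue is still spiking, whereas the sustained check waits until index 8. Whether a sustained check would be sufficient in practice is not established here.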


Dates

  • Created:
    Updated:
    Last comment:
    3 years, 29 weeks, 4 days ago