Down (Unplanned Maintenance) in reports although health ok in Cons
j.g.
2009-11-23 19:12:02 UTC
Most of our SQL servers are reporting "Down (Unplanned Maintenance)" in
availability reports although they are healthy according to the console.

This false downtime started appearing after we had to remove the SQL MPs
because they were causing performance issues in our environment. They maxed
out the CPU on servers with large numbers of databases, and removing the packs
was quicker than manually disabling all the monitors.

Performance data is being collected and alerts are coming in for the servers,
but for some reason the warehouse is convinced they are down. Reset/recalculate
health doesn't help. Placing the agents in Maintenance Mode for 5+ minutes
didn't help. Removing and reinstalling the agents did not help either.

Does anyone have any suggestions on how to reset the agent health state in
the Warehouse, and/or a link to a documented procedure to delete all traces
of an agent in the databases (so that we can reinstall the agents and
rebuild correct state data in the warehouse)?
j.g.
2009-11-23 20:28:02 UTC
I may have tracked down the cause of this myself. I checked the
dbo.MaintenanceMode and dbo.HealthServiceOutage tables for rows where
EndDateTime was NULL and found a large number of entries whose StartDateTime
was the day and time we removed the SQL packs.
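
For anyone wanting to run the same kind of check, here is a rough sketch of
the lookup described above. It uses Python with an in-memory SQLite database
as a stand-in for the warehouse, so it is only an illustration: the table
names and the StartDateTime/EndDateTime columns come from this post, while the
ManagedEntityRowId column, the sample rows and their dates are assumptions
made up for the example. The real tables have more columns than this.

```python
import sqlite3

# Table names taken from the post; everything else here is a toy stand-in.
TABLES = ("MaintenanceMode", "HealthServiceOutage")


def make_toy_warehouse():
    """Build an in-memory database with made-up sample rows."""
    conn = sqlite3.connect(":memory:")
    for table in TABLES:
        # ManagedEntityRowId is an assumed key column for illustration only.
        conn.execute(
            f"CREATE TABLE {table} ("
            "ManagedEntityRowId INTEGER, StartDateTime TEXT, EndDateTime TEXT)"
        )
    conn.executemany(
        "INSERT INTO HealthServiceOutage VALUES (?, ?, ?)",
        [
            (1, "2009-11-20 10:00:00", None),  # never closed
            (2, "2009-11-20 10:00:00", None),  # never closed
            (3, "2009-11-01 09:00:00", "2009-11-01 09:30:00"),  # closed normally
        ],
    )
    conn.executemany(
        "INSERT INTO MaintenanceMode VALUES (?, ?, ?)",
        [
            (1, "2009-11-20 10:00:00", None),  # never closed
            (4, "2009-10-15 08:00:00", "2009-10-15 08:10:00"),  # closed normally
        ],
    )
    return conn


def open_rows(conn, table):
    """Return (ManagedEntityRowId, StartDateTime) for rows with no EndDateTime."""
    if table not in TABLES:
        raise ValueError(f"unexpected table: {table}")
    cur = conn.execute(
        f"SELECT ManagedEntityRowId, StartDateTime FROM {table} "
        "WHERE EndDateTime IS NULL ORDER BY ManagedEntityRowId"
    )
    return cur.fetchall()


conn = make_toy_warehouse()
for table in TABLES:
    print(table, open_rows(conn, table))
```

If many open rows share one StartDateTime, as they did here, that points at a
single event as the cause rather than at individual agents.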

I'll update when I have tested a few potential solutions.
j.g.
2009-11-26 20:06:01 UTC
I updated the tables mentioned in my previous post and set a valid EndDateTime
and DWLastModifiedDateTime for the rows in question.

It takes a while, but SCOM eventually recalculates the hourly and daily
aggregates and the reports start showing correct data.

I found absolutely no documentation for this "solution". It is my own
best-guess solution after digging through data in the DW. If you choose to try
it yourself, be aware that you are doing so entirely at your own risk.
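
To show the shape of the update described above, here is a minimal sketch in
Python against an in-memory SQLite table. It is not the statement that was run
against the warehouse. The table name and the EndDateTime and
DWLastModifiedDateTime columns come from this thread; the ManagedEntityRowId
column, the sample rows and the timestamp values are assumptions picked for
illustration. The post only says a "valid" EndDateTime was set, so which value
to use is left to the caller.

```python
import sqlite3

# Table names taken from the thread; used as a whitelist for the f-string below.
ALLOWED_TABLES = ("MaintenanceMode", "HealthServiceOutage")


def close_open_rows(conn, table, end_time, modified_time):
    """Set EndDateTime and DWLastModifiedDateTime on rows that are still open.

    Returns the number of rows changed.
    """
    if table not in ALLOWED_TABLES:
        raise ValueError(f"unexpected table: {table}")
    cur = conn.execute(
        f"UPDATE {table} "
        "SET EndDateTime = ?, DWLastModifiedDateTime = ? "
        "WHERE EndDateTime IS NULL",
        (end_time, modified_time),
    )
    conn.commit()
    return cur.rowcount


# Toy stand-in for the warehouse table, with made-up rows and dates.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE HealthServiceOutage ("
    "ManagedEntityRowId INTEGER, StartDateTime TEXT, "
    "EndDateTime TEXT, DWLastModifiedDateTime TEXT)"
)
conn.executemany(
    "INSERT INTO HealthServiceOutage VALUES (?, ?, ?, ?)",
    [
        (1, "2009-11-20 10:00:00", None, "2009-11-20 10:00:00"),
        (2, "2009-11-20 10:00:00", None, "2009-11-20 10:00:00"),
        (3, "2009-11-01 09:00:00", "2009-11-01 09:30:00", "2009-11-01 09:30:00"),
    ],
)

# Arbitrary example values: end the outage shortly after it started and
# stamp the rows as modified at the time of the fix.
closed = close_open_rows(
    conn, "HealthServiceOutage", "2009-11-20 10:05:00", "2009-11-26 20:00:00"
)
remaining = conn.execute(
    "SELECT COUNT(*) FROM HealthServiceOutage WHERE EndDateTime IS NULL"
).fetchone()[0]
print(f"closed {closed} rows, {remaining} still open")
```

Only rows with a NULL EndDateTime are touched, so outages that ended normally
keep their original values. On a real warehouse, back up the database first
and run the matching SELECT before any UPDATE.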