There are many good ideas and so many bright minds out there. You just have to pull all this energy together, mix it with your own ideas, and you will create wonderful things.
A practical example would be the following story. A customer who likes simple things came up with an idea. Imagine you have a monthly maintenance window in your company during which you patch all your servers. As good practice, you put all your lovely servers into maintenance mode. Great! After you have applied your patches and rebooted (?) your servers, you probably want to check that all of them actually rebooted, are up and running, and are healthy in SCOM.
Now the thing is, are you sure ALL your servers rebooted within your maintenance window? Seeing a green agent in the SCOM console doesn’t guarantee that the server rebooted. And what if there are service desk guys who don’t know anything about SCOM and need a quick check to see whether a server rebooted and is healthy, meaning available in SCOM?
Of course, one approach could be to create rules which detect boot event IDs on each server and throw an alert if a server does not generate such an event, and so on. But honestly, don’t we already have enough alerts in SCOM?
Inspired by the customer’s idea and by a blog post from the bright PowerShell guy Thiyagu, I wrote a PowerShell script which generates an HTML report. I adapted Thiyagu’s way of generating an HTML file in PowerShell, so all credit goes to him.
OK, what is it all about…
1) First download the script from here
2) Run Get-BootAvailabilityReport.ps1, providing 3 parameters:
- uptimethreshold – This parameter defines, in hours, how far back your maintenance window reaches. Let’s say you have finished patching and rebooting all your servers, and no server should have been up for longer than 60 hours (in this example). If a server has an uptime of more than 60 hours, it did not reboot.
- filename – Output path where you want to have the report file written e.g. c:\temp\report.html (must be a valid directory and a *.html file name)
- scom – SCOM management server which the script connects to.
3) After the script has finished, Internet Explorer opens and presents you with a nice report
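Putting step 2 together, a call might look like the sketch below. The management server name scom01.contoso.local is just a placeholder, and the sketch assumes the script takes the three values as named parameters, exactly as they are listed above.

```powershell
# Hypothetical invocation; replace the server name and the path with your own.
# Assumes the three parameters are passed by name, as listed in step 2.
.\Get-BootAvailabilityReport.ps1 -uptimethreshold 60 -filename 'c:\temp\report.html' -scom 'scom01.contoso.local'
```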
The report contains 6 columns:
- Server Name – This is the name of the server
- Last Boot Time – This is the time the server booted last
- Server Uptime – How long the server has been up and running
- Server Uptime (Total Hours) – How long the server has been up and running, in hours
- Last Agent Health State – Last status of the agent health in SCOM
- Agent Available – Shows if the agent is available (=True) or e.g. grey (=False)
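To give a feeling for where the uptime columns could come from, here is a minimal sketch and not the code from the script itself. The function name Get-UptimeInfo, its parameters and the property names are invented for this example. The commented lines at the end show one common way to read the last boot time via WMI, assuming you have WMI access to the server (server01 is a placeholder).

```powershell
# Sketch only: turns a last boot time into the uptime values shown in the report.
# Get-UptimeInfo and its property names are made up for this example.
function Get-UptimeInfo {
    param(
        [Parameter(Mandatory=$true)] [string]   $ServerName,
        [Parameter(Mandatory=$true)] [datetime] $LastBootTime,
        [datetime] $Now = (Get-Date)   # can be overridden, e.g. for testing
    )

    # Subtracting two DateTime values yields a TimeSpan
    $uptime = $Now - $LastBootTime

    [pscustomobject]@{
        ServerName       = $ServerName
        LastBootTime     = $LastBootTime
        Uptime           = '{0} days {1} hours {2} minutes' -f $uptime.Days, $uptime.Hours, $uptime.Minutes
        UptimeTotalHours = [math]::Round($uptime.TotalHours, 1)
    }
}

# One way to get the boot time via WMI (needs WMI access to the server):
# $os   = Get-WmiObject -Class Win32_OperatingSystem -ComputerName 'server01'
# $boot = $os.ConvertToDateTime($os.LastBootUpTime)
# Get-UptimeInfo -ServerName 'server01' -LastBootTime $boot
```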
How to deal with it…
- The uptimethreshold parameter which you provide to the script is compared against the Server Uptime (Total Hours). If the Server Uptime (Total Hours) is greater than the uptimethreshold it will be colored red.
- If the Last Agent Health State has a result other than Success, it will appear in red.
- If the Agent Available has a value other than True, it will appear in red.
So that means if the last 3 columns are colored green, everything is OK. If Server Uptime (Total Hours) is red, the server did not reboot. If Agent Available is red, the SCOM agent is not available. The Last Agent Health State shows only the last health state the agent had in SCOM. This can be misleading: if an agent is grey in SCOM and didn’t have a critical health state before it went grey, this column will still appear green.
I hope you get the idea behind it and like it as much as I do.
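The three coloring rules can be summed up in a few lines of PowerShell. This is only a sketch of the logic described here, not the script’s actual code. The function name Get-CellColor and the hashtable keys are invented for illustration.

```powershell
# Sketch of the coloring rules described above; not the script's actual code.
# Get-CellColor and the hashtable keys are invented for this example.
function Get-CellColor {
    param(
        [Parameter(Mandatory=$true)] [double] $UptimeTotalHours,
        [Parameter(Mandatory=$true)] [string] $LastAgentHealthState,
        [Parameter(Mandatory=$true)] [bool]   $AgentAvailable,
        [Parameter(Mandatory=$true)] [double] $UptimeThreshold
    )

    @{
        # Red if the server has been up longer than the threshold (no reboot)
        UptimeTotalHours     = if ($UptimeTotalHours -gt $UptimeThreshold) { 'red' } else { 'green' }
        # Red for any health state other than Success
        LastAgentHealthState = if ($LastAgentHealthState -ne 'Success') { 'red' } else { 'green' }
        # Red if the agent is not available (e.g. grey in SCOM)
        AgentAvailable       = if (-not $AgentAvailable) { 'red' } else { 'green' }
    }
}

# Example: server did not reboot, and the agent went grey without a prior critical state
Get-CellColor -UptimeTotalHours 72 -LastAgentHealthState 'Success' -AgentAvailable $false -UptimeThreshold 60
```

The example call reproduces the misleading case: the health state cell stays green while the uptime and availability cells turn red.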
The script uses the SCOM 2012 cmdlets and also WMI to query the servers. Make sure you have the necessary permissions and that the required ports are open for the script to run properly.
Download the script, have fun!