A while ago there was an issue at a customer's site where I had to provide a solution for restarting a Windows service in response to a Windows event log entry. This is nothing special, and we SCOM guys do it almost every day. The fun part was that it was a custom service with several dependent services. The goal was to restart the main service in such a way that the dependent services ended up in the same running state they had before the main service was restarted.
An example: the service SvcMain (startup type “manual”, status = “running”) has two dependent services, SvcSub1 (startup type “manual”, status = “running”) and SvcSub2 (startup type “manual”, status = “stopped”).
Now, when a Windows event occurs (e.g. Event ID 1234), the recovery task should trigger and restart SvcMain and its dependent services SvcSub1 and SvcSub2 so that they have the same running status as before. That means SvcMain must be running, SvcSub1 has to be started again, and SvcSub2 has to stay stopped.
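The ordering described above can be sketched in a few lines. This is not the recovery script itself (SCOM expects a *.vbs file there) but a minimal Python illustration of the logic only, assuming the service states have already been read. The function name plan_restart and the lowercase state strings "running" and "stopped" are made up for this sketch.

```python
from typing import Dict, List, Tuple


def plan_restart(main: str, dependent_states: Dict[str, str]) -> List[Tuple[str, str]]:
    """Return the ordered (action, service) steps for restarting `main`.

    `dependent_states` maps each dependent service name to its state
    before the restart ("running" or "stopped"; hypothetical values
    chosen for this sketch).
    """
    # Remember which dependent services were running before the restart.
    was_running = [name for name, state in dependent_states.items()
                   if state == "running"]

    # Dependent services must be stopped before the main service can stop.
    steps = [("stop", name) for name in was_running]
    steps.append(("stop", main))

    # Start the main service again, then only the dependents that were
    # running before. Previously stopped dependents are left alone.
    steps.append(("start", main))
    steps.extend(("start", name) for name in was_running)
    return steps


if __name__ == "__main__":
    example = {"SvcSub1": "running", "SvcSub2": "stopped"}
    for action, service in plan_restart("SvcMain", example):
        print(action, service)
```

For the example above, the plan stops SvcSub1 and SvcMain, then starts SvcMain and SvcSub1 again, and never touches SvcSub2.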
To visualize what I mean by dependent services, let’s have a look at the “DHCP Client” service, which has the following dependent services: “Network Location Awareness” and “WinHTTP Web Proxy Auto-Discovery Service”…
Everything o.k.? So I thought I would make this script available for you guys.
Let’s first start with the event monitor…
Create Event Monitor
Start the wizard for a manual reset event monitor…
Give it a name and choose an appropriate class…
Choose the event log, in my case “Application”…
Set the “Event ID” and “Event Source” you are looking for…
Select how the health state of the monitor should appear…
Choose if you want to receive an alert or not…
Recovery Task
Search for the monitor and open it. On the “Diagnostic and Recovery” tab, under the “Configure recovery tasks” section, click “Add…”…
Choose “Run Script”…
Enter a name and click “Next”…
Set the “File Name” for the script (any name is possible, just make sure you give it a *.vbs extension), then click “Edit in full screen…” and copy/paste this script there. Make sure you define the name of the Windows service you want to restart in the script file…
Click “Create” and close all windows…
Well, that’s it. The next time your specified event occurs, the recovery task will be triggered and your services will be restarted.
Have fun…
Hi. This script didn’t run for me.
Line 24, char 111
error:unterminated string constant
code: 800a0409
Hi Sonja
You might have a typo or bad character in the script, check http://www.computerperformance.co.uk/Logon/code/code_800A0409.htm .
Try to execute the script in the console and see if you hit an error.
Cheers,
Stefan
Hello, do you know if there is a way to run a report on how many times the auto recovery task ran? If I target a Windows service and set it to auto resolve, how can I tell how many times the service was auto resolved?
Hi Teresa
There might be different ways. You could extend the script to write a log file, or you could make the script write a Windows event log entry and then create a rule which collects these Windows events. Just some thoughts…
Cheers,
Stefan
Hello,
When completing this custom recovery task, I am getting the following error.
Date: 2013-04-22 12:50:52 PM
Application: Operations Manager
Application Version: 7.0.9538.0
Severity: Error
Message:
: Verification failed with 1 errors:
——————————————————-
Error 1:
Found error in 1|SystemCenterCustomMP|1.0.0.0|MomUIGenaratedRecovery08551b16474047acaaa6bc11326e83e4|| with message:
Target class Microsoft.Windows.Computer for Recovery MomUIGenaratedRecovery08551b16474047acaaa6bc11326e83e4 does not derive from Target class Microsoft.Windows.Server.Computer of the monitor (UIGeneratedMonitoraf180643cbd341f9bd053b0cc50b2c80) that this recovery is assigned to.
——————————————————-
: Target class Microsoft.Windows.Computer for Recovery MomUIGenaratedRecovery08551b16474047acaaa6bc11326e83e4 does not derive from Target class Microsoft.Windows.Server.Computer of the monitor (UIGeneratedMonitoraf180643cbd341f9bd053b0cc50b2c80) that this recovery is assigned to.
Also, when assigning a computer name, what is a valid entry? IP? FQDN?Computer Name?
I am currently using the FQDN.
Hi
You don’t need to assign a computer name because the script runs on the computer where you want to restart the service. A “.” means the local server.
Cheers,
Stefan
Hi
Try to create it by following my instructions, and use the Windows Computer class as the target in both cases.
Stefan
Hi ScomFaq,
I cannot download the script, so I tried a simple script myself. It works fine:
Const strServiceName = "Informatica9.1.0"
' Shell.Application exposes IsServiceRunning and ServiceStart
Set oShell = CreateObject("Shell.Application")
' Start the service only if it is not already running
If Not oShell.IsServiceRunning(strServiceName) Then
    oShell.ServiceStart strServiceName, False
End If
Just one question: once it has tried three times, it must stop trying. How do I do this?
Kind regards,
André Borgeld
Nice article, thank you for posting! Do you have any concerns or things to watch out for when running recovery tasks? I was cautioned by my SCOM PFE to avoid them but I don’t remember why.
Hi
Well, I think it is because SCOM is meant to monitor and not to fix things. In today’s world you would use Orchestrator to execute recovery tasks. I also try to avoid recovery tasks (they are hard to find, they add “load”, etc.); I use them only if there is no other way.
Cheers,
Stefan
Well, if you build it in PowerShell instead of WMI/VBS, then it would be as powerful as Orchestrator.
And monitoring has to be reactive and proactive.
Yes, correct, BUT why should we have error-fixing scripts distributed in SCOM and maybe in other places of our infrastructure if we can have them centralized in Orchestrator? In my opinion Orchestrator makes more sense…
With recovery tasks you can be proactive if you do it right. So it’s a must in a good monitoring system.
Hello,
Small question: when you want to use a recovery script in SCOM (so run a script and not run a command), must the script be in VBS, or can you use a PowerShell script?
Thanks in advance.
Hi
It is possible to run PowerShell but not through the GUI. You would need the Authoring console or VSAE to accomplish this.
Cheers,
Stefan
Is the command to run for the recovery located on the server that triggers the error condition or on the SCOM servers? And does it run locally on the SCOM server or on the server that has the error?
Hi Stefan;
Thanks for the great article.
I configured a monitor in SCOM that checks a log file and generates an alert unless its modified time is less than 10 minutes old, which works fine.
I also configured a recovery task to restart the application service and wrote a VBScript for it. The script works fine when I run it with elevated privileges, but from a normal CMD prompt it gives an access denied error.
I am not sure whether SCOM 2012 R2 runs the recovery task script with elevated privileges, as my recovery script is not running successfully (the service that is supposed to restart does not).
Also, where can I find the logs to check whether the recovery task ran properly in SCOM?
regards;
Nilesh Gavali.
Hi Nilesh
In general when an MP script fails for whatever reason, you should see some error in the Operations Manager event log.
Cheers,
Stefan