
SCOM 2012 – Linux Two-State Monitor with “Script in Script”

October 21, 2012 stefanroth

If you are a Windows guy and have not touched Linux so far, implementing SCOM will force you to get hands-on with *nix systems. Here I would like to show a neat little way to overcome a limitation of the Unix/Linux Shell Command Two (or Three) State Monitor.

This two-state monitor lets you call a shell script or a one-line command sequence (using pipeline operators). That means you can only run a single “one-liner”, chaining commands with the pipe symbol “|”, e.g. ls -l /tmp | wc -l . This example counts the files and directories in the /tmp directory. The “ls -l” command is similar to the Windows “dir” command, and its output is sent to “wc -l”, which counts lines (wc = word count; the -l switch makes it count lines instead of words). Because “ls -l” also prints a leading “total” line, the result is one higher than the actual number of entries. In the real world, though, most scripts on the Linux side are not just one-liners. A Linux guy might create a script, or he might ask you whether you can execute a script that calls another script. Sounds complicated? No, let me show you…
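As a small sketch of such a one-liner, the following counts the entries of a scratch directory instead of /tmp so that the result is predictable. The directory and file names are made up for illustration. It also runs the same count with “ls -1” (digit one), which prints no “total” line.

```shell
#!/bin/bash
# Scratch directory with three files (hypothetical names, for illustration only).
demo_dir=$(mktemp -d)
touch "$demo_dir/a.log" "$demo_dir/b.log" "$demo_dir/c.log"

# The one-liner pattern from the text: "ls -l" output includes a leading "total" line.
with_total=$(ls -l "$demo_dir" | wc -l)

# "ls -1" prints one name per line and no "total" line.
entries_only=$(ls -1 "$demo_dir" | wc -l)

echo "ls -l | wc -l: $with_total"
echo "ls -1 | wc -l: $entries_only"

# Clean up the scratch directory.
rm -rf "$demo_dir"
```

Which variant to use is a matter of taste, as long as the threshold in the monitor is chosen with the extra line in mind.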

In the /tmp directory I created two text files, countfile.sh and runcount.sh (don’t be confused by the .sh ending, these are just plain text files).

The countfile.sh has two lines…

#!/bin/bash => Specifies which shell executes this script
ls -l /tmp | wc -l | bc => Counts the files in the /tmp directory. I just added the “bc” command, which is used to convert the output to an integer value. For this example you would not need it.

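The runcount.sh file also has two lines…

Notice here the line…

. /tmp/countfile.sh => This line calls countfile.sh AND returns its output in the same shell. The “.” (dot) sources the file, i.e. runs it inside the current shell rather than in a separate one, so its output and any variables it sets are available to runcount.sh. Without the dot, countfile.sh would run in a separate shell; its output would still be printed, but values it stores in variables would not be visible to the calling script.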
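To make the two files concrete, here is a minimal sketch that writes both scripts into a scratch directory and runs the wrapper. The content of runcount.sh is an assumption based on the description above (a shebang line plus the dot line), and the scratch directory stands in for /tmp so the example does not touch real files.

```shell
#!/bin/bash
# Scratch directory standing in for /tmp so the demo stays isolated (assumption).
work=$(mktemp -d)
mkdir "$work/data"
touch "$work/data/one" "$work/data/two"

# countfile.sh: the counting pipeline. The unquoted EOF lets $work expand now,
# so the generated script contains the real path. "bc" is left out because the
# text describes it as optional.
cat > "$work/countfile.sh" <<EOF
#!/bin/bash
ls -l "$work/data" | wc -l
EOF

# runcount.sh: sources countfile.sh with the dot operator (assumed content).
cat > "$work/runcount.sh" <<EOF
#!/bin/bash
. "$work/countfile.sh"
EOF

# Run the wrapper through bash and capture what it writes to standard output.
result=$(bash "$work/runcount.sh")
echo "runcount.sh printed: $result"

# Clean up the scratch directory.
rm -rf "$work"
```

The captured value is the "total" line plus the two files. Whatever the wrapper prints to standard output is what the monitor later evaluates.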

Next we need to make these scripts readable and executable. How do we do that? We set the permissions of the files to read and execute using the “chmod” command. You must set these permissions or you won’t be able to run the scripts.
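The exact chmod call is not shown here, so the following is only one plausible way to do it: the mode “a+rx” grants read and execute permission to everyone. The sketch works on a throwaway file rather than the real scripts, and the variable names are made up.

```shell
#!/bin/bash
# Throwaway script used only to demonstrate the permission change (hypothetical).
demo_script=$(mktemp)
printf '#!/bin/bash\necho hello\n' > "$demo_script"

# Grant read and execute permission to owner, group and others.
chmod a+rx "$demo_script"

# Verify: -r and -x test readability and executability for the current user.
if [ -r "$demo_script" ] && [ -x "$demo_script" ]; then
    perm_ok=yes
else
    perm_ok=no
fi
echo "readable and executable: $perm_ok"

# Remove the throwaway file.
rm -f "$demo_script"
```

Applied to the files from this post, the equivalent call would be chmod a+rx /tmp/countfile.sh /tmp/runcount.sh, assuming everyone should be allowed to read and run them.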

You can now check whether it works by executing the /tmp/runcount.sh script…

We have 20 files and directories in the /tmp directory! Cool!

Now let’s build the monitor…

Give it a name and choose a class, in my case “SUSE Linux Enterprise Computer”…

For this test I chose to run the monitor every 30 seconds (choose a longer interval for production, e.g. 30 minutes)…

Next you need to provide the shell command…

If there are more than 10 files in the directory, raise an error alert…

And if there are 10 or fewer files, the monitor will be healthy…

We leave it the way it is…

Adjust the alert settings to your needs…

Notice that I added the line

$Data/Context///*[local-name()="StdOut"]$

to the description field. It contains the output of the script, in our case the file/folder count.

After a short time you receive an error alert once the threshold is exceeded…

and the alert properties…

In this example I showed you how to

create simple shell scripts
call a shell script from a shell script
use the “script in script” in SCOM

This will help you to overcome the “one-liner” limitation and the limitation of executing just one script.

If you need to monitor Linux systems, my advice is this: I always try to get the Linux guys to build all the logic into their scripts and just return a value for the monitored state. On the SCOM side I just call the script and map its output to the monitor states. E.g. if the script output “0” means unhealthy and “1” means healthy, I map these values to the two-state monitor. You could also use words like “NOK” or “OK” for the unhealthy and healthy states.
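As a rough illustration of this approach, the sketch below keeps the threshold logic inside the script and prints only a state word. The function name check_entries, the limits and the scratch directory are assumptions for the example; the monitor would then only need to match “OK” or “NOK” in the script output.

```shell
#!/bin/bash
# check_entries DIR LIMIT (hypothetical helper): prints OK when DIR holds
# LIMIT entries or fewer, NOK otherwise.
check_entries() {
    local dir=$1 limit=$2 count
    # ls -1 prints one entry per line, so wc -l yields the entry count.
    count=$(ls -1 "$dir" | wc -l)
    if [ "$count" -gt "$limit" ]; then
        echo "NOK"
    else
        echo "OK"
    fi
}

# Demo on a scratch directory: 3 files against a limit of 10, then a limit of 2.
demo_dir=$(mktemp -d)
touch "$demo_dir/f1" "$demo_dir/f2" "$demo_dir/f3"
state_ok=$(check_entries "$demo_dir" 10)
state_nok=$(check_entries "$demo_dir" 2)
echo "limit 10: $state_ok"
echo "limit 2: $state_nok"

# Clean up the scratch directory.
rm -rf "$demo_dir"
```

Keeping the decision in the script means a threshold change only touches the script, not the monitor definition.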

I hope you find this useful…

stefanroth

http://www.stefanroth.net

16 Replies to “SCOM 2012 – Linux Two-State Monitor with “Script in Script””

Jonathan Almquist says:

November 4, 2012 at 23:42

Great blog you have here – very informative. Glad I stumbled onto it.

1. scomfaq says:
  
  November 4, 2012 at 23:45
  
  Hi Jonathan
  
  Thank you very much! It’s an honor to have you reading my blog!
  
  Regards,
  
  Stefan
  
施安宏 says:

December 21, 2012 at 14:41

Hi, can I detect a log file and run a command, without using a timer? Thx.

1. scomfaq says:
  
  January 2, 2013 at 22:01
  
  Hi
  
  Sorry, I am not quite sure what you mean. Can you give me some more details?
  
  Regards,
  
  Stefan
  
Arthur says:

May 29, 2013 at 12:21

Hello Stefan,

I followed your post for another bash script. The monitor has been added, but its state stays green with the context “The monitor has been initialized for the first time or it has exited maintenance mode”, and the status never changes…

Prakash Bhimji says:

July 1, 2013 at 10:23

Hi Stefan,

I wonder if you can please advise. I have followed your article (which is great, by the way), but I am not receiving any SCOM alerts. If I run the command on the Solaris server it returns a value of 9 (which is correct), and my error expression is set to alert if there are more than 5 files in the folder. The target group is set to Solaris 10 Computers. Is there something that I am missing?

Kind Regards
Prakash

cloudwrap says:

July 28, 2013 at 05:44

Good post – nice to see someone really getting to grips with Linux monitoring. It’s becoming more important in the enterprise space and this kind of work adds credibility to the value of SCOM in that mix.

1. scomfaq says:
  
  July 29, 2013 at 10:43
  
  Hi,
  
  Thanks for your comment. As Microsoft moves more and more in this direction, I think it is very important to have a common understanding of “both” worlds.
  
  Cheers,
  
  Stefan
  
user_feo says:

September 17, 2013 at 21:22

I want to set up something similar for one of our Linux servers, but I want to know whether there are any errors in a log. I have used a script and the actual command, but it does not return an error when I test it.
command: tail -n30 /tmp/log.070113 | awk '/error/'
Any suggestions?

Fabian says:

October 17, 2013 at 11:09

Hello,

Is there any way to debug a monitor? My script works fine and SCOM is able to execute it: I defined a task to test this and everything works. Still, my monitor doesn’t generate an event, even though it should. So my question is, how can I debug my SCOM monitor? I want to see exactly what it does so I can solve my problem.

Greetings,

Fabian

Tejas says:

October 23, 2014 at 16:56

I have a similar issue. My monitor is not changing status.

Jean-Paul says:

January 8, 2015 at 09:21

Hello! Now that we have built out our SCOM environment far enough that the infrastructure is being monitored, we would like to monitor various processes. Since we already do a lot of monitoring through scripts and check many things on the Linux side that way, I came across this blog.

My first problem is that I cannot find the template mentioned above. Was the monitor not built with standard tools? SCOM (2012 R2 UR3) does not show me all of these options!

Thanks for your help. 🙂

1. Stefan Roth says:
  
  January 8, 2015 at 16:59
  
  Hello
  
  You still need to import the Microsoft.Unix.ShellCommand.Library.mpb management pack from the SCOM installation source; the template should then be visible.
  There are further MPs of this kind that provide MP templates for UNIX systems:
  Microsoft.Unix.Process.Library.mpb => monitor UNIX processes
  Microsoft.Unix.Logfile.Library.mpb => monitor UNIX log files
  
  Regards
  
  Stefan
  
Robert says:

April 24, 2015 at 08:46

This monitor didn’t work in my environment. In my case the threshold was 16, but the alert came up when the value was greater than 16 and also when it was less than 10. I found out that the data type of the StdOut was “String” and not “Integer” in the MP!

To solve the problem, export the override MP in which the monitor is saved and search for the “StdOut” string in the “HealthyExpression” block. Five lines below “StdOut” you’ll find a “ValueExpression” block with the Value Type="String". Replace “String” with “Integer” or the data type you prefer. Do the same in the “ErrorExpression” block.

Now the StdOut part of the HealthyExpression block should contain these values (XPath query, operator, threshold):

//*[local-name()="StdOut"]
LessEqual
16

And the same part of the ErrorExpression block:

//*[local-name()="StdOut"]
Greater
16

16 is the value I entered in the monitoring properties.

Santhosh M says:

July 22, 2015 at 17:53

Hi Stefan,

The article was very helpful for creating a custom monitor for Linux. I have created one to check the logical partition space on a Linux server. But the monitor status stays the same all the time and does not change to the error state.

“The monitor has been initialized for the first time or it has exited maintenance mode”

Could you please help me sort out the issue?

Max2014 says:

November 3, 2016 at 14:29

Hi Stefan, your blog is so good! I would like to ask whether you can help me with a script for monitoring the Postfix mail queue. My colleague gave me a command line like this: postqueue -p | tail -n 1 | cut -d' ' -f5 … but I don’t know how to use it. Please help me with a generic two-state script. Thank you so much.
