Thursday, July 9, 2009

ITM6 : Take Action : Remount Stale remote filesystems


 

Case statement

Methodology

Situation #1 : Detect "Stale" remote FS

Situation #2 : Remount Situation

Formula Conditions

Take Action


 

Case statement

An ITM6 Unix (UX) or Linux (LZ) agent detects that its remote filesystems are unavailable, a "stale" connection.

Automate an action to remount them where possible.

Methodology

Two (2) ITM situations are created:

  1. the first detects that the mount point has become stale
  2. the second is triggered by the "correlated situation" condition of #1 being true

Situation #1 : Detect "Stale" remote FS

Simple enough: if "Space Available" fails collection, there is an issue with the mount.
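To confirm by hand what the situation is reporting, something like the following can be run on the agent host. This is only a sketch, not what ITM does internally: it assumes GNU coreutils `timeout` and `stat` are installed, and `is_stale` and `STALE_TIMEOUT` are names made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: report whether a mount point looks stale.
# Assumes GNU coreutils `timeout` and `stat`; STALE_TIMEOUT is an
# illustrative variable name, not an ITM setting.
is_stale() {
    mount_point=$1
    # A stale remote mount usually hangs or errors on stat; the timeout
    # keeps the check from blocking indefinitely.
    if timeout "${STALE_TIMEOUT:-5}" stat "$mount_point" >/dev/null 2>&1; then
        return 1    # stat answered: not stale
    fi
    return 0        # error or timeout: treat as stale
}
```

Usage would be along the lines of `is_stale /mnt/data && echo "stale"`. Note that a path which simply does not exist is also reported as stale by this check.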



 

Situation #2 : Remount Situation

Formula Conditions

Use the "Situation Comparison" for a condition against the situation above


Take Action

Note: in the Take Action field the commands are strung together on a single line. To make them more legible here, a newline follows each semicolon.

f="&{Linux_Disk.Mount_Point}" ;
u=`umount -f "$f" 2>&1 && echo "$f"`;
m=`mount "$f" 2>&1 && echo "$f"`;
echo -e "umount: $u\nmount: $m" | mail -s "ITM ACTION: Remount $f" junkmail@JdsMedia.net
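If the one-liner gets hard to maintain, the same logic can live in a small script on the agent host and be called from the Take Action field. The version below is a sketch under that assumption. `remount_report`, `UMOUNT_CMD` and `MOUNT_CMD` are hypothetical names; the two variables exist only so the function can be dry-run without touching a real mount.

```shell
#!/bin/sh
# Sketch of the remount logic as a function.
# UMOUNT_CMD and MOUNT_CMD are hypothetical override hooks for dry runs;
# left unset, the real umount -f and mount are used.
remount_report() {
    f=$1
    # Capture each command's output; on success the mount point is
    # appended, mirroring the Take Action one-liner.
    u=$(${UMOUNT_CMD:-umount -f} "$f" 2>&1 && echo "$f")
    m=$(${MOUNT_CMD:-mount} "$f" 2>&1 && echo "$f")
    printf 'umount: %s\nmount: %s\n' "$u" "$m"
}
```

A dry run would look like `UMOUNT_CMD=true MOUNT_CMD=true remount_report /mnt/data`, and the real thing can be piped into `mail -s "ITM ACTION: Remount ..."` just like the original.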


 

ITM Host Availability (ping attribute)

Synopsis

A nice feature of the ITM Linux and Unix agents that isn't publicized much is the ping capability. It is referred to as "Host Availability" for Linux and "Ping Attributes" for Unix. Both accept an input file containing a list of servers to "ping", and both return status and response time.

Enabling the ping hosts file

Linux

$CANDLEHOME/config/lz.ini: KLZ_PINGHOSTLIST=<path_to_list>

# e.g. KLZ_PINGHOSTLIST=$CANDLEHOME/config/my_hostlist

# Add this to lz.config as well if you want to avoid having to reconfigure the agent

Unix

$CANDLEHOME/config/ux.ini: KUX_PINGHOSTLIST=<path_to_list>

# e.g. KUX_PINGHOSTLIST=$CANDLEHOME/config/my_hostlist

# Add this to ux.config as well if you want to avoid having to reconfigure the agent
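Before pointing the agent at a host list, it can be worth checking that the entries actually answer. The sketch below assumes the list holds one host per line and that blank lines and `#` comments can be skipped; both are assumptions on my part, so check the agent documentation for the real file format. `check_hostlist` and `PING_CMD` are made-up names for this example.

```shell
#!/bin/sh
# Hypothetical sanity check for a ping host list.
# Assumes one host per line; blank lines and '#' comments are skipped
# (an assumed convention, not confirmed ITM behaviour).
# PING_CMD is an illustrative override hook; default is a single ping.
check_hostlist() {
    list=$1
    while IFS= read -r host; do
        case $host in
            ''|'#'*) continue ;;
        esac
        if ${PING_CMD:-ping -c 1} "$host" >/dev/null 2>&1; then
            echo "$host up"
        else
            echo "$host DOWN"
        fi
    done < "$list"
}
```

Something like `check_hostlist "$CANDLEHOME/config/my_hostlist"` would then print one status line per host.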

Extra feature difference

One minor but important difference: the Linux agent will only ping the servers in the list, while the Unix agent can additionally use a situation to ping any target host from any managed system. This makes every managed system (agent) a potential ping source for critical target servers.

Attribute Group: UNIX Ping

Situation Definitions: System_Name == $NODE$, Target_Host == webserver1

Wednesday, July 8, 2009

IBM Tivoli Monitoring Product Codes

Obtaining Product Codes for IBM Tivoli Monitoring (ITM v6)

Local method:

Parse the proddsc.tbl file on a UNIX/Linux system to get the list by doing this:


 

UNIX

awk -F\| '/^[^*#]/ {print $1,$2}' ${CANDLEHOME}/registry/proddsc.tbl | sort | uniq
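When only one code is of interest, the same parsing can be wrapped in a lookup. This is a sketch that relies on the layout the awk command above implies (pipe-delimited, code in field 1, description in field 2, comment lines starting with `*` or `#`). `itm_product_name` is a hypothetical name, and the match is exact and case-sensitive.

```shell
#!/bin/sh
# Hypothetical lookup: print the description for one product code.
# Relies on the same layout as the awk command above: pipe-delimited,
# code in field 1, description in field 2, '*' or '#' comment lines.
itm_product_name() {
    code=$1
    tbl=${2:-${CANDLEHOME}/registry/proddsc.tbl}
    awk -F'|' -v code="$code" '
        /^[^*#]/ && $1 == code { print $2; found = 1; exit }
        END { exit !found }
    ' "$tbl"
}
```

Usage would be `itm_product_name lz`; the function returns non-zero when the code is not in the table.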


 

Or, IBM's site http://www-01.ibm.com/support/docview.wss?rs=2366&context=SSZ8F3&dc=DB520&dc=DB560&uid=swg21265222&loc=en_US&cs=UTF-8&lang=en&rss=ct2366tivoli

ITM Logs Timestamp Conversion

ITM v6 log files use a hexadecimal timestamp (to save space? who the hell knows), which adds unnecessary effort when the reason you're looking at the logs is to determine an issue in the first place. In any case... Here's the script I wrote when I first encountered the nonsense in ITM v6 logs a few years ago:

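#!/usr/bin/perl
# Replace the leading hex timestamp on each ITM log line with HH:MM:SS MM/DD/YYYY.
while (<STDIN>) {
    # Skip the leading punctuation, then capture the hex timestamp.
    if (/^[^\s\d\w]+([\w\d]*)/) {
        @t = localtime(hex($1));
        $time = sprintf("%02d:%02d:%02d %02d/%02d/%04d",
            $t[2], $t[1], $t[0], $t[4]+1, $t[3], $t[5]+1900);
        s/^[^\s\w\d]+[\w\d]*/$time/;
    }
    print $_;
}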

Here's a one-liner that Venkat.Saranathan at Gulfsoft.com cranked out, rendering my script pretty much obsolete:

perl -lane 'if ($_ =~ /^(.)([\dA-F]+)(\..*)/) { printf "%s%s%s\n", $1, scalar(localtime(oct("0x$2"))),$3; }'
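For a single timestamp copied out of a log, plain shell is enough. The sketch below assumes GNU `date` (the `-d @epoch` form); `hex_to_epoch` and `hex_to_date` are names made up for this example, and the output format mirrors the Perl script above.

```shell
#!/bin/sh
# Hypothetical helpers for converting one hex timestamp by hand.
hex_to_epoch() {
    # printf accepts C-style hex constants, so prefix the value with 0x.
    printf '%d\n' "0x$1"
}

hex_to_date() {
    # GNU date syntax; prints HH:MM:SS MM/DD/YYYY in the local timezone.
    date -d "@$(hex_to_epoch "$1")" '+%H:%M:%S %m/%d/%Y'
}
```

Called as `hex_to_date 4A55B2C1`, using whatever hex value appears at the start of the log line.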

About Me

Been in the IT world since '92 (man, has it been over 16yrs already) and took a real shine to scripting and web interfaces in '96 when someone showed me their HTML profile in Netscape. Most of my career has been involved with automation and sysadmin scripts in distributed environments. In 1998 I became a certifiable (sic) Tivoli Enterprise consultant, which focused on all aspects of my skill set and kept me damned busy... still. My Tivoli consulting came through a couple of strong outfits { Crestone and Gulf Breeze } but I am now working for myself as JDS Media. Not really a media company, but I liked the domain years ago.