
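New Dell servers ship with an installation disc for OMSA (OpenManage Server Administrator), with versions for Linux, Windows, and other systems, and it is easy to install. An earlier post on this site covered installing OMSA from the Dell yum repository. Once installed, OMSA serves a management page over HTTPS on port 1311, where you can log in with a browser and see which hardware has problems. With hundreds or thousands of servers, though, checking each one this way takes far too much time and effort. This post shows how to monitor Dell server hardware with the check_openmanage plugin for Nagios.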

Downloading the plugin
Go to http://folk.uio.no/trondham/software/check_openmanage.html and download the latest version

# wget http://folk.uio.no/trondham/software/files/check_openmanage-3.5.1.tar.gz
# tar -xvzf  check_openmanage-3.5.1.tar.gz
# ll check_openmanage-3.5.1
total 3524
-rw-r--r-- 1 45150 55150   15379 Oct 22 17:39 CHANGES
-rwxr-xr-x 1 45150 55150  133625 Oct 22 17:09 check_openmanage
-rw-r--r-- 1 45150 55150   24065 Oct 22 17:09 check_openmanage.8
-rwxr-xr-x 1 45150 55150 3290403 Oct 22 17:12 check_openmanage.exe
-rw-r--r-- 1 45150 55150    5304 Oct 22 17:09 check_openmanage.php
-rw-r--r-- 1 45150 55150   16269 Oct 22 17:09 check_openmanage.pod
-rw-r--r-- 1 45150 55150    3988 Oct 22 17:09 check_openmanage.spec
-rw-r--r-- 1 45150 55150   35147 Oct 22 17:09 COPYING
-rw-r--r-- 1 45150 55150     533 Oct 22 17:09 INSTALL
-rwxr-xr-x 1 45150 55150     406 Oct 22 17:09 install.bat
-rwxr-xr-x 1 45150 55150    1082 Oct 22 17:09 install.sh
-rw-r--r-- 1 45150 55150    2727 Oct 22 17:09 README

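The archive contains two builds of the plugin: check_openmanage for Linux and check_openmanage.exe for Windows.
Installation
check_openmanage does not need to be compiled; it can be used as soon as it is unpacked. There are three prerequisites:
1. The server to be monitored must be a Dell server
2. Dell OMSA must already be installed on the monitored server
3. Nagios is already installed and running normally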
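
Prerequisites 2 and 3 can be checked quickly from a shell on the host. The following is a minimal sketch, not part of the plugin: it assumes OMSA's omreport tool lives at /opt/dell/srvadmin/bin/omreport (the location varies between OMSA versions) and that Nagios plugins live in /usr/local/nagios/libexec, the prefix used elsewhere in this post. The function name check_prereqs is made up for illustration.

```shell
#!/bin/sh
# Sketch: check that OMSA and the Nagios plugin directory are present.
# Both default paths are assumptions; pass your own as arguments.
check_prereqs() {
    omreport=${1:-/opt/dell/srvadmin/bin/omreport}
    plugin_dir=${2:-/usr/local/nagios/libexec}
    status=0
    if [ ! -x "$omreport" ]; then
        echo "OMSA omreport not found or not executable: $omreport"
        status=1
    fi
    if [ ! -d "$plugin_dir" ]; then
        echo "Nagios plugin directory not found: $plugin_dir"
        status=1
    fi
    if [ "$status" -eq 0 ]; then
        echo "prerequisites look fine"
    fi
    return "$status"
}
```

Calling check_prereqs with no arguments uses the assumed defaults; a non-zero return value means at least one prerequisite is missing.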

Copy check_openmanage to the Nagios plugin directory
# cp check_openmanage-3.5.1/check_openmanage /usr/local/nagios/libexec/

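check_openmanage can obtain Dell hardware information in two ways: by running locally on the server, or over SNMP. Because the SNMP setup had some problems when I installed OMSA through yum on Linux, I chose to monitor the Linux hosts by calling the plugin through NRPE.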

Configuring Nagios (monitoring Linux)
Monitoring Linux servers through the NRPE plugin
On the Nagios monitoring server

check_omsa service definition file
# vi /usr/local/nagios/etc/objects/Dell_OMSA/dell_service_linux.cfg
define service {
host_name                       sns001
service_description             check_omsa
use                             linux-web-service
hostgroup_name                  DellLinuxHosts
check_command                   check_nrpe!check_omsa
_ser_info                       dell omsa
check_interval                  10
notification_options            c,r
}

Host group definition file
define hostgroup{
hostgroup_name          DellLinuxHosts
alias                   Dell Linux server group
members                 sns001,snsdb001,rms001,rmsdb001,opt-001
}

Changes to the main Nagios configuration file
# vi /usr/local/nagios/etc/nagios.cfg
cfg_file=/usr/local/nagios/etc/objects/Dell_OMSA/dell_service_linux.cfg

NRPE configuration on the monitored host

# vi /usr/local/nagios/etc/nrpe.cfg
Add:
command[check_omsa]=/usr/local/nagios/libexec/check_openmanage -b ctrl_fw=ALL\/ctrl_driver=ALL -p
# service nrped restart
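
Before relying on the service check, it can help to run the NRPE command by hand from the monitoring server. The sketch below assumes check_nrpe is installed in /usr/local/nagios/libexec (the same prefix as the rest of this post) and relies on the standard Nagios plugin exit codes: 0 is OK, 1 is WARNING, 2 is CRITICAL, anything else is UNKNOWN. The function names state_name and test_omsa are made up for illustration.

```shell
#!/bin/sh
# Map a Nagios plugin exit code to its state name (standard plugin convention).
state_name() {
    case "$1" in
        0) echo OK ;;
        1) echo WARNING ;;
        2) echo CRITICAL ;;
        *) echo UNKNOWN ;;
    esac
}

# Run check_omsa on one host through NRPE and report the resulting state.
# The check_nrpe path is an assumption based on the install prefix used above.
test_omsa() {
    host=$1
    /usr/local/nagios/libexec/check_nrpe -H "$host" -c check_omsa
    rc=$?
    echo "state: $(state_name "$rc")"
}
```

For example, test_omsa sns001 prints the plugin output from that host followed by the state Nagios would record for it.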

Configuring Nagios (monitoring Windows)
Monitoring Windows servers over SNMP

# vi /usr/local/nagios/etc/objects/Dell_OMSA/dell_service_win.cfg
define service {
host_name                       heidrick
service_description             check_omsa
use                             win-rrd-service
hostgroup_name                  DellWinHosts
check_command                   check_omsa4win
_ser_info                       dell omsa
check_interval                  10
notification_options            c,r
}
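
The service above refers to a check_omsa4win command that this post does not show. It presumably wraps check_openmanage in SNMP mode, where -H names the remote host and -C gives the SNMP community. As a sketch only, the helper below prints one possible definition for commands.cfg. The community string public is a placeholder, -p is added so that performance data is emitted, and $USER1$ is assumed to point at /usr/local/nagios/libexec. The function name print_omsa4win_command is made up for illustration.

```shell
#!/bin/sh
# Print a hypothetical check_omsa4win command definition for commands.cfg.
# $1 is the SNMP community string ("public" is only a placeholder default).
# Single quotes keep the Nagios macros ($USER1$, $HOSTADDRESS$) literal.
print_omsa4win_command() {
    community=${1:-public}
    printf 'define command {\n'
    printf '    command_name    check_omsa4win\n'
    printf '    command_line    $USER1$/check_openmanage -H $HOSTADDRESS$ -C %s -p\n' "$community"
    printf '}\n'
}
```

Running print_omsa4win_command with your own community string and appending the output to commands.cfg would give Nagios a definition to resolve check_omsa4win against; adjust it to match however the command is actually defined on your server.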

Changes to the main Nagios configuration file
# vi /usr/local/nagios/etc/nagios.cfg
Add:
cfg_file=/usr/local/nagios/etc/objects/Dell_OMSA/dell_service_win.cfg

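If the pnp4nagios add-on is installed, Nagios can also graph the fan speeds and chassis temperatures that check_openmanage reports, as shown in the figure below.
[Figure: check_omsa]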
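
pnp4nagios graphs the performance data that a plugin prints after the | character in its output, which is why the NRPE command earlier passes -p to check_openmanage. The sketch below pulls that part out of an output line so you can see what would be graphed. The function name perfdata_of is made up, and the sample line in the usage note is invented for illustration, not real check_openmanage output.

```shell
#!/bin/sh
# Print only the performance-data part of a Nagios plugin output line,
# i.e. everything after the first "|", with leading spaces removed.
# Prints an empty line when the output carries no performance data.
perfdata_of() {
    case "$1" in
        *"|"*) printf '%s\n' "${1#*"|"}" | sed 's/^ *//' ;;
        *)     printf '\n' ;;
    esac
}
```

For instance, perfdata_of 'OK - all fine | fan_0=3600 temp_0=22' prints fan_0=3600 temp_0=22. An empty result suggests the plugin was run without -p.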

For more details, see: http://folk.uio.no/trondham/software/check_openmanage.html

Restart Windows Failed Service batch script with log.


File: win_service_restart.cmd
Author: Vadims Zenins http://vadimszenins.blogspot.com
Version: 1.04
Date: 20/04/2009 17:41
Windows Failed Service restart batch file for Nagios Event Handler
Copy win_service_restart.cmd to \NSClient++\scripts\ folder.
Nagios commands.cfg:
define command{
command_name win_service_restart
command_line $USER1$/check_nrpe -H $HOSTADDRESS$ -p 5666 -c win_service_restart -a "$SERVICEDESC$" $SERVICESTATE$ $SERVICESTATETYPE$ $SERVICEATTEMPT$
}
Nagios template-services_common-win.cfg
define service{
name generic-service-win-wuauserv
service_description wuauserv
display_name Automatic Updates
event_handler win_service_restart
event_handler_enabled 1
check_command check_nt!SERVICESTATE!-d SHOWALL -l $SERVICEDESC$
}
NSCLIENT++ NSC.ini:
[NRPE]
allowed_hosts=192.168.1.1/32 ; your Nagios server IP
allow_arguments=1
[External Script]
allow_arguments=1
allow_nasty_meta_chars=1
[NRPE Handlers]
command[win_service_restart]=scripts\win_service_restart.cmd "$ARG1$" $ARG2$ $ARG3$ $ARG4$
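
The command definition above hands the script four values: the service description, $SERVICESTATE$, $SERVICESTATETYPE$ and $SERVICEATTEMPT$. The batch file itself is not reproduced here, so the following is only a sketch of the decision logic an event handler of this kind commonly applies, modeled on the restart example in the Nagios event handler documentation: act on a CRITICAL state once it is HARD, or on the third SOFT attempt. It is written as a shell function purely for illustration; the real script is a Windows batch file and may behave differently. The function name should_restart is made up.

```shell
#!/bin/sh
# Decide whether an event handler should restart the service.
# Arguments: service state, state type, attempt number.
# Prints "restart" or "skip".
should_restart() {
    state=$1
    type=$2
    attempt=$3
    if [ "$state" != "CRITICAL" ]; then
        echo skip
    elif [ "$type" = "HARD" ]; then
        echo restart
    elif [ "$type" = "SOFT" ] && [ "$attempt" -eq 3 ]; then
        echo restart
    else
        echo skip
    fi
}
```

Note that a handler written this way can fire twice for one outage, once on the third SOFT attempt and again when the state turns HARD, so a real script needs some guard against restarting the service a second time.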
Version 1.04 revision:
Double restart of the service is fixed
Version 1.03 revision:
Description is changed
Version 1.02 revision:
@NET changed to @SC
Version 1.01 revision:
Problem with service names containing spaces is fixed
http://vadimszenins.blogspot.com/2008/12/nagios-restart-windows-failed-services.html
md5: fd00753533e5fb655d824c3bf1d36d4f *win_service_restart.zip

Windows Failed Service restart batch file
Vadims Zenins
Tue, 20 Oct 2009 13:35:38 GMT