Our networks and servers need careful looking after; otherwise you will run into a lot of trouble.
So let's find a tool to help us look after these machines we keep worrying about: Nagios. Here I set up basic host monitoring; deeper configuration takes more study, and there is plenty of material online. Below is a brief write-up of my own configuration. There may be mistakes; please point them out. Thank you for reading.

1. Create the nagios user

    adduser nagios
    mkdir /usr/local/nagios
    chown nagios:nagios /usr/local/nagios

2. Unpack the downloaded package

    tar xzvf nagios-version.tar.gz

3. Compile

Change into the unpacked directory and run:

    ./configure --prefix=/usr/local/nagios --with-nagios-user=nagios --with-nagios-group=nagios --with-command-group=nagios
    make all
    make install
    make install-config
    make install-init

Nagios is now installed.

4. Create the authenticated user for accessing Nagios

    /usr/bin/htpasswd -c /usr/local/nagios/etc/htpasswd andy

Set the password when prompted.

Apache is already installed under /usr/local/apache/.

5. Add the Nagios settings to Apache. Open /usr/local/apache/conf/httpd.conf and append the following at the end of the file:

    ScriptAlias /nagios/cgi-bin /usr/local/nagios/sbin
    <Directory "/usr/local/nagios/sbin">
        Options ExecCGI
        AllowOverride None
        Order allow,deny
        Allow from all
        AuthName "Nagios Access"
        AuthType Basic
        AuthUserFile /usr/local/nagios/etc/htpasswd
        Require valid-user
    </Directory>

    Alias /nagios /usr/local/nagios/share
    <Directory "/usr/local/nagios/share">
        Options None
        AllowOverride None
        Order allow,deny
        Allow from all
        AuthName "Nagios Access"
        AuthType Basic
        AuthUserFile /usr/local/nagios/etc/htpasswd
        Require valid-user
    </Directory>

6. Restart Apache and open Nagios in a browser

    http://IP/nagios

7. Install the plugins

This article uses nagios-plugins-1.4.13.tar.gz.

Unpack it:

    tar zxvf nagios-plugins-1.4.13.tar.gz

Change into the unpacked directory nagios-plugins-1.4.13 and configure:

    ./configure --prefix=/usr/local/nagios    # same as the Nagios install directory

Build and install:

    make
    make install

8. Nagios configuration in detail

(1) First edit /usr/local/nagios/etc/objects/localhost.cfg. I put all the hosts to be monitored in localhost.cfg.

================================================
Define the hosts to monitor (oracle.test.com, CVS)

    define host{
        use                 linux-server
        host_name           oracle.test.com
        alias               Database
        address             192.168.1.176
        contact_groups      admins
        parents             routergw
        icon_image          server.gif
        statusmap_image     server.gd2
        2d_coords           500,200
        3d_coords           500,200,100
        }

    define host{
        use                 linux-server
        host_name           CVS
        alias               CVS
        address             192.168.1.183
        contact_groups      admins
        parents             routergw
        icon_image          server.gif
        statusmap_image     server.gd2
        2d_coords           500,200
        3d_coords           500,200,100
        }

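The two host blocks above differ only in name, alias and address, so if the list of monitored machines grows it may be easier to generate them. The following is only a minimal sketch in Python, not part of the original setup: `HOST_TEMPLATE` and `render_hosts` are hypothetical names, the template keeps just the core directives from the blocks above (the icon and coordinate lines are left out for brevity), and it assumes every host shares the same template, contact group and parent.

```python
# Hypothetical helper: renders Nagios "define host{}" blocks from a list.
# The directive values (linux-server, admins, routergw) are the ones used
# in this article; everything else here is illustrative.
HOST_TEMPLATE = """define host{{
    use             linux-server
    host_name       {name}
    alias           {alias}
    address         {address}
    contact_groups  admins
    parents         routergw
    }}
"""


def render_hosts(hosts):
    """Return one define host{} block per (name, alias, address) tuple."""
    return "\n".join(
        HOST_TEMPLATE.format(name=name, alias=alias, address=address)
        for name, alias, address in hosts
    )


if __name__ == "__main__":
    # The two hosts defined in this article.
    print(render_hosts([
        ("oracle.test.com", "Database", "192.168.1.176"),
        ("CVS", "CVS", "192.168.1.183"),
    ]))
```

The output can be pasted into localhost.cfg or written to a separate object file; either way it still has to pass the Nagios pre-flight check before it is used.
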
================================================
Define the host group and the service group

    define hostgroup{
        hostgroup_name  linux-servers   ; The name of the hostgroup
        alias           Linux Servers   ; Long name of the group
        members         *               ; Comma separated list of hosts that belong to this group
        }

    define servicegroup{
        servicegroup_name   linuxserv
        alias               services
        members             oracle.test.com,SSH,oracle.test.com,PING,oracle.test.com,disk,CVS,PING,CVS,disk,CVS,SSH
        }

================================================
Define the services to monitor

Note: in the sample commands.cfg, the check_local_* commands run their plugins on the Nagios server itself. Assigned to remote hosts as they are here, they report the monitoring server's own disk, users, processes, load and swap; checking those values on the remote machines needs an agent such as NRPE.

    define service{
        use                     local-service
        host_name               oracle.test.com,CVS
        service_description     PING
        check_command           check_ping!100.0,20%!500.0,60%
        notifications_enabled   1
        }

    # Define a service to check the disk space of the root partition
    # on the local machine. Warning if < 20% free, critical if
    # < 10% free space on partition.
    define service{
        use                     local-service
        host_name               oracle.test.com,CVS
        service_description     disk
        check_command           check_local_disk!20%!10%!/
        notifications_enabled   1
        }

    define service{
        use                     local-service
        host_name               oracle.test.com,CVS
        service_description     users
        check_command           check_local_users!20!50
        notifications_enabled   1
        }

    # Define a service to check the number of currently running procs
    # on the local machine. Warning if > 250 processes, critical if
    # > 400 processes.
    define service{
        use                     local-service
        host_name               oracle.test.com,CVS
        service_description     procs
        check_command           check_local_procs!250!400!RSZDT
        notifications_enabled   1
        }

    # Define a service to check the load on the local machine.
    define service{
        use                     local-service
        host_name               oracle.test.com,CVS
        service_description     local_load
        check_command           check_local_load!5.0,4.0,3.0!10.0,6.0,4.0
        notifications_enabled   1
        }

    define service{
        use                     local-service
        host_name               oracle.test.com,CVS
        service_description     swap
        check_command           check_local_swap!20!10
        notifications_enabled   1
        }

    # Define a service to check SSH on the local machine.
    # The sample config disables notifications for this service by default,
    # as not all users may have SSH enabled; they are enabled here.
    define service{
        use                     local-service
        host_name               oracle.test.com,CVS
        service_description     SSH
        check_command           check_tcp!22!1.0!10.0
        notifications_enabled   1
        }

    # Define a service to check HTTP on the local machine.
    # The sample config disables notifications for this service by default,
    # as not all users may have HTTP enabled; they are enabled here.
    define service{
        use                     local-service
        host_name               oracle.test.com
        service_description     HTTP
        check_command           check_http
        notifications_enabled   1
        }

================================================
(2) Edit the contacts in /usr/local/nagios/etc/objects/contacts.cfg

    define contact{
        contact_name                    nagiosadmin
        use                             generic-contact
        alias                           Nagios Admin
        service_notification_commands   notify-service-by-sms,notify-service-by-email
        host_notification_commands      notify-host-by-sms,notify-host-by-email
        email                           andylhz@XX.com  ; email address for urgent alerts
        pager                           138*****        ; mobile number for urgent SMS alerts
        }

===============================================
(3) Edit the command definition file /usr/local/nagios/etc/objects/commands.cfg

Define the service notification command:

    define command{
        command_name    notify-service-by-sms
        command_line    /usr/bin/sms -f 138******5 -p ****** -t $CONTACTPAGER$ -m "$HOSTNAME$ $SERVICEDESC$ is $SERVICESTATE$ on $TIME$, result is $SERVICEOUTPUT$" $CONTACTPAGER$
        }

Define the host notification command:

    define command{
        command_name    notify-host-by-sms
        command_line    /usr/bin/sms -f 138******5 -p ****** -t $CONTACTPAGER$ -m "Host $HOSTSTATE$ alert for $HOSTNAME$! on '$LONGDATETIME$' " $CONTACTPAGER$
        }

===============================================
Note: the sms program used above to send text messages has to be installed separately. There are many such tools online, so I won't cover it here.

Test whether the configuration is correct:

    /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg

If everything is fine, you will see output like the following:

    Nagios Core 3.2.0
    Copyright (c) 2009 Nagios Core Development Team and Community Contributors
    Copyright (c) 1999-2009 Ethan Galstad
    Last Modified: 08-12-2009
    License: GPL

    Website: http://www.nagios.org
    Reading configuration data...
    Read main config file okay...
    Processing object config file '/usr/local/nagios/etc/objects/commands.cfg'...
    Processing object config file '/usr/local/nagios/etc/objects/contacts.cfg'...
    Processing object config file '/usr/local/nagios/etc/objects/timeperiods.cfg'...
    Processing object config file '/usr/local/nagios/etc/objects/templates.cfg'...
    Processing object config file '/usr/local/nagios/etc/objects/localhost.cfg'...
    Processing object config file '/usr/local/nagios/etc/objects/switch.cfg'...
    Read object config files okay...

    Running pre-flight check on configuration data...

    Checking services...
        Checked 16 services.
    Checking hosts...
        Checked 3 hosts.
    Checking host groups...
        Checked 2 host groups.
    Checking service groups...
        Checked 1 service groups.
    Checking contacts...
        Checked 1 contacts.
    Checking contact groups...
        Checked 1 contact groups.
    Checking service escalations...
        Checked 0 service escalations.
    Checking service dependencies...
        Checked 0 service dependencies.
    Checking host escalations...
        Checked 0 host escalations.
    Checking host dependencies...
        Checked 0 host dependencies.
    Checking commands...
        Checked 26 commands.
    Checking time periods...
        Checked 5 time periods.
    Checking for circular paths between hosts...
    Checking for circular host and service dependencies...
    Checking global event handlers...
    Checking obsessive compulsive processor commands...
    Checking misc settings...

    Total Warnings: 0
    Total Errors: 0

Start Nagios:

    service nagios start
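The pre-flight check can also be scripted, so that Nagios is only started or restarted when the configuration is clean. Below is a minimal sketch in Python, assuming the binary and config paths used in this article; `parse_preflight` and `config_is_clean` are hypothetical helper names, and the parser relies only on the "Total Warnings" and "Total Errors" lines shown in the output above.

```python
import re
import subprocess

# Paths as used in this article; adjust them if your install prefix differs.
NAGIOS_BIN = "/usr/local/nagios/bin/nagios"
NAGIOS_CFG = "/usr/local/nagios/etc/nagios.cfg"


def parse_preflight(output):
    """Return (warnings, errors) from `nagios -v` output, or None if the totals are missing."""
    warnings = re.search(r"Total Warnings:\s*(\d+)", output)
    errors = re.search(r"Total Errors:\s*(\d+)", output)
    if not warnings or not errors:
        return None
    return int(warnings.group(1)), int(errors.group(1))


def config_is_clean(nagios_bin=NAGIOS_BIN, cfg=NAGIOS_CFG):
    """Run the pre-flight check; True only on a zero exit status and zero reported errors."""
    result = subprocess.run([nagios_bin, "-v", cfg], capture_output=True, text=True)
    totals = parse_preflight(result.stdout)
    return result.returncode == 0 and totals is not None and totals[1] == 0


if __name__ == "__main__":
    # Parse a fragment shaped like the tail of the output shown above.
    sample = "Checking misc settings...\n\nTotal Warnings: 0\nTotal Errors: 0\n"
    print(parse_preflight(sample))
```

A wrapper script could call `config_is_clean()` and run `service nagios start` only when it returns True. Warnings are reported but not treated as fatal in this sketch; whether they should block a restart is a judgment call.
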
This article is reposted from andylhz's 51CTO blog. Original link: http://blog.51cto.com/andylhz2009/211044. To repost it, please contact the original author.