Installing Nagios 3.0 and Nagios-plugins (Host and Service Monitoring)
Nagios-plugins is the set of plugins used to check system status, memory usage, CPU utilization, and so on.
If you previously installed Nagios from YaST, uninstall the following packages first:
- nagios
- nagios-nsca
- nagios-nsca-client
- nagios-plugins
- nagios-plugins-extras
- nagios-www
If you have not installed them, skip the list above.
Make sure the following packages are installed:
- gd-devel
- libpng-devel
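The package check above can be scripted. This is a minimal sketch, assuming an RPM-based system such as SUSE where `rpm -q <package>` exits non-zero when a package is not installed. The function name `missing_packages` and the `PKG_QUERY` override are illustrative and not part of Nagios.

```shell
# Command used to query one package; "rpm -q" is an assumption for SUSE.
PKG_QUERY="${PKG_QUERY:-rpm -q}"

# Print the name of every package in "$@" that the query command does not find.
missing_packages() {
    for pkg in "$@"; do
        # $PKG_QUERY is intentionally unquoted so "rpm -q" splits into command + flag.
        $PKG_QUERY "$pkg" >/dev/null 2>&1 || printf '%s\n' "$pkg"
    done
}
```

Running `missing_packages gd-devel libpng-devel` prints nothing when both packages are installed.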
1. Download the software
$ cd <folder where you saved the downloaded files>
$ tar -zxvf nagios-3.0b6.tar.gz
$ tar -zxvf nagios-plugins-1.4.10.tar.gz
2. Create the user and groups
$ useradd -m nagios
$ groupadd nagios
$ groupadd nagcmd
$ usermod -G nagios,nagcmd nagios
$ usermod -G nagcmd wwwrun
3. Install Nagios 3.0
$ cd nagios-3.0b6
$ ./configure --prefix=/opt/nagios --with-cgiurl=/nagios/cgi-bin --with-htmurl=/nagios --with-nagios-user=nagios \
--with-nagios-group=nagios --with-command-group=nagcmd
$ make all
$ make install
$ make install-init
$ make install-commandmode
$ make install-config
$ make install-webconf
4. Install Nagios Plugins 1.4.10
$ cd ../nagios-plugins-1.4.10
$ ./configure --prefix=/opt/nagios --with-nagios-user=nagios --with-nagios-group=nagios
$ make
$ make install
5. Configure Nagios 3.0
$ vi /opt/nagios/etc/nagios.cfg
log_file=/var/opt/nagios/nagios.log
object_cache_file=/var/opt/nagios/objects.cache
precached_object_file=/var/opt/nagios/objects.precache
command_file=/var/opt/nagios/rw/nagios.cmd
lock_file=/var/opt/nagios/nagios.tmp
log_archive_path=/var/opt/nagios/archives
check_result_path=/var/opt/nagios/spool/checkresults
state_retention_file=/var/opt/nagios/retention.dat
debug_file=/var/opt/nagios/nagios.debug
6. Create the directories
$ mkdir -p /var/opt/nagios/rw
$ mkdir -p /var/opt/nagios/spool/checkresults
$ mkdir -p /var/opt/nagios/archives
$ chown -R nagios:nagios /var/opt/nagios
$ chown -R nagios:nagcmd /var/opt/nagios/rw
$ chmod 2775 /var/opt/nagios/rw
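The directories created in step 6 have to match the paths set in nagios.cfg in step 5. As a rough cross-check, the sketch below lists every directory that the `*_file` and `*_path` directives of a config file depend on. It assumes absolute paths without spaces, and the function name `required_dirs` is illustrative.

```shell
# Print the directories required by the *_file and *_path directives
# of a nagios.cfg-style file ($1), one per line, without duplicates.
required_dirs() {
    # Keep only "key=/absolute/path" lines whose key ends in _file or _path.
    awk -F= '/^[a-z_]+_(file|path)=\// { print $1, $2 }' "$1" |
    while read -r key value; do
        case "$key" in
            *_path) printf '%s\n' "$value" ;;   # the value is itself a directory
            *)      dirname "$value" ;;         # the value is a file: need its parent
        esac
    done | LC_ALL=C sort -u
}
```

For example, `required_dirs /opt/nagios/etc/nagios.cfg | while read -r d; do [ -d "$d" ] || echo "missing: $d"; done` reports directories that still need to be created.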
7. Apache Security
$ htpasswd2 -c /opt/nagios/etc/htpasswd.users sysadmin
Password : Your_password
8. Start Apache and Nagios
$ rcapache2 restart
$ /etc/rc.d/init.d/nagios start
9. Automatic startup at system boot time
$ insserv nagios
10. Test the installation
URL: http://<IP Address Server>/nagios
Username is "sysadmin"
Password is "Your_password"
Nagios Error: Could not open command file '/var/nagios/rw/nagios.cmd' for update!
Solution: change group from "nagios" to "www"
$ id nagios
$ cd /opt/nagios/var/rw/
$ chgrp www nagios.cmd
Adding remote Linux/Unix hosts:
Example (default)
$ vi /opt/nagios/etc/objects/localhost.cfg
##Added by Sontaya
define host {
use linux-server
host_name hostname
alias hostname.mydomain
address Public IP Address / Private IP Address
}
define service {
use local-service
host_name hostname
service_description PING
check_command check_ping!100.0,20%!500.0,60%
}
define service {
use local-service
host_name hostname
service_description SSH
check_command check_ssh
}
# Define a service to check the disk space of the root partition
# on the local machine. Warning if < 20% free, critical if
# < 10% free space on partition.
define service{
use local-service
host_name hostname
service_description Root Partition
check_command check_local_disk!20%!10%!/
}
# Define a service to check the number of currently logged in
# users on the local machine. Warning if > 15 users, critical
# if > 20 users.
define service{
use local-service
host_name hostname
service_description Current Users
check_command check_local_users!15!20
}
define service{
use local-service
host_name hostname
service_description Total Processes
check_command check_local_procs!250!400!RSZDT
}
define service{
use local-service
host_name hostname
service_description Current Load
check_command check_local_load!5.0,4.0,3.0!10.0,6.0,4.0
}
# Define a service to check the swap usage of the local machine.
# Critical if less than 10% of swap is free, warning if less than 20% is free
define service{
use local-service ; Name of service template to use
host_name hostname
service_description Swap Usage
check_command check_local_swap!20!10
}
Adding email contacts:
$ vi /opt/nagios/etc/objects/contacts.cfg
email nagios@hostname,admin@mydomain.com
Restart nagios:
$ /etc/rc.d/init.d/nagios restart
Ref:
http://nagios.sourceforge.net/docs/3_0/plugins.html
http://www.novell.com/coolsolutions/feature/16723.html
http://code.google.com/p/onebusaway/wiki/NagiosConfiguration#Nagios_User_and_Group
Installing NRPE on the remote hosts (clients): daemon and plugin for executing plugins on remote hosts
Normally, monitoring is done by running plugins on the Nagios server against the machine to be monitored, mostly by sending a message
and waiting for the response, as the check_ping and check_http plugins do. Some checks cannot be done this way,
for example check_load and check_disk, because these plugins only work on the local machine. NRPE lets the Nagios server run such plugins on the remote host.
Important files:
check_nrpe is the plugin used to query the nrpe daemon on the remote host.
nrpe is the agent that runs on the remote host and communicates with the plugin.
nrpe.cfg is the configuration file on the remote host.
1. Create the user account and group
$ useradd nagios
$ passwd nagios
Set the password to "nagios"
$ groupadd nagios
2. Download Nagios Plugins:
$ mkdir -p /opt/nagios/
Download the file from the URL below and save it in "/opt/nagios/":
http://www.nagios.org/download/download.php
(nagios-plugins-1.4.13.tar.gz)
$ tar zxvf nagios-plugins-1.4.13.tar.gz
3. Install Nagios Plugins
*** Check that openssl-devel is installed. If it is not, install it from YaST first, because the plugins support SSL.
$ cd nagios-plugins-1.4.13
$ ./configure --with-nagios-user=nagios --with-nagios-group=nagios
$ make
$ make install
4. Set permissions on the plugin folder:
$ chown nagios:nagios /usr/local/nagios
$ chown -R nagios:nagios /usr/local/nagios/libexec
5. Install the NRPE daemon
Download the file from the URL below and save it in "/opt/nagios/":
http://www.nagios.org/download/download.php
(nrpe-2.12.tar.gz)
$ tar zxvf nrpe-2.12.tar.gz
$ cd nrpe-2.12
$ ./configure
$ make all
$ make install-plugin
$ make install-daemon
$ make install-daemon-config
$ make install-xinetd
6. Configure the NRPE port
$ vi /etc/xinetd.d/nrpe
Add the IP address of the Nagios server to this line:
only_from = 127.0.0.1 192.168.1.13
Then save the file.
$ vi /etc/services
Add port 5666 to the services file:
nrpe 5666/tcp # NRPE
Restart xinetd:
$ rcxinetd restart
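The only_from edit in /etc/xinetd.d/nrpe can also be done non-interactively. The sketch below is one possible approach, assuming the line is written as `only_from = <ip> <ip> ...` with spaces around the equals sign, as in the example above. The function name `add_only_from` is illustrative. It writes the modified file to standard output rather than changing it in place.

```shell
# Print xinetd service file $1 with IP address $2 appended to the
# only_from line, unless that address is already listed there.
add_only_from() {
    awk -v ip="$2" '
        /^[ \t]*only_from[ \t]*=/ {
            present = 0
            # Fields: $1 = "only_from", $2 = "=", $3.. = allowed addresses.
            for (i = 3; i <= NF; i++) if ($i == ip) present = 1
            if (!present) $0 = $0 " " ip
        }
        { print }
    ' "$1"
}
```

A possible use is `add_only_from /etc/xinetd.d/nrpe 192.168.1.13 > /tmp/nrpe.new`, followed by a review of the result before copying it back.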
Check that the NRPE daemon is listening:
$ netstat -at | grep nrpe
tcp 0 0 *:nrpe *:* LISTEN
Check the NRPE version:
$ /usr/local/nagios/libexec/check_nrpe -H localhost
NRPE v2.12
7. Configure the firewall (iptables)
Open port 5666:
$ iptables -A INPUT -m state --state NEW -m tcp -p tcp --dport 5666 -j ACCEPT
Verify:
$ netstat -ntlp | grep 5666
tcp 0 0 0.0.0.0:5666 0.0.0.0:* LISTEN -
Tip:
$ vi /etc/sysconfig/scripts/SuSEfirewall2-custom
#NRPE
iptables -A INPUT -m state --state NEW -m tcp -p tcp --dport 5666 -j ACCEPT
$ rcSuSEfirewall2 reload
8. Edit nrpe.cfg
$ vi /usr/local/nagios/etc/nrpe.cfg
command[check_users]=/usr/local/nagios/libexec/check_users -w 5 -c 10
command[check_load]=/usr/local/nagios/libexec/check_load -w 15,10,5 -c 30,25,20
command[check_disk]=/usr/local/nagios/libexec/check_disk -w 20% -c 10%
command[check_zombie_procs]=/usr/local/nagios/libexec/check_procs -w 5 -c 10 -s Z
command[check_total_procs]=/usr/local/nagios/libexec/check_procs -w 150 -c 200
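*** A command can be defined in two ways: with its thresholds fixed in nrpe.cfg, or with $ARG1$-style placeholders that take their values from the Nagios server.
(The commands above use fixed thresholds. Accepting arguments from the Nagios server additionally requires NRPE to be built with --enable-command-args and dont_blame_nrpe=1 to be set in nrpe.cfg.)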
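The names inside `command[...]` are what the Nagios server later asks for, so it helps to list them. A minimal sketch, assuming one `command[name]=...` definition per line as in the nrpe.cfg excerpt above; the function name `defined_nrpe_commands` is illustrative.

```shell
# Print the command names defined in an nrpe.cfg-style file ($1),
# sorted and without duplicates. Commented-out definitions are ignored.
defined_nrpe_commands() {
    sed -n 's/^command\[\([^]]*\)]=.*/\1/p' "$1" | LC_ALL=C sort -u
}
```

For example, `defined_nrpe_commands /usr/local/nagios/etc/nrpe.cfg` lists the names that a check_nrpe request may use on this host.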
===================================================================================
Configuring the Nagios Server
===================================================================================
1. Test with telnet
Test a telnet connection to the remote host (client):
$ telnet 192.168.11.3 5666
Trying 192.168.11.3...
Connected to 192.168.11.3.
Escape character is '^]'.
^]
telnet> quit
Connection closed.
2. Configure commands.cfg
$ vi /opt/nagios/etc/objects/commands.cfg
# NRPE CHECK COMMAND
# Command to use NRPE to check remote host systems
#
###############################################################################
#
define command{
command_name check_nrpe
command_line $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$
}
*** You can define two commands and use them separately: one for checks that pass arguments and one for checks that do not.
3. Configure host-linux.cfg
Format when passing arguments: check_nrpe!<command>!<arg1> <arg2> <arg3>
check_nrpe!check_procs!5 10 Z
Tip: separate the arguments with spaces
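To see how a check_command value turns into a command line, the sketch below mimics the way Nagios splits the value on "!" and substitutes the pieces for $ARG1$, $ARG2$, and so on. It is an illustration only: the function name `expand_command_line` is made up, the second command definition with `-a $ARG2$` is a hypothetical example of the argument-passing variant, and macros such as $USER1$ and $HOSTADDRESS$ are left untouched.

```shell
# Expand $ARGn$ macros in a command_line template.
#   $1 = check_command value, e.g. 'check_nrpe!check_procs!5 10 Z'
#   $2 = command_line template from the matching "define command" block
expand_command_line() {
    awk -v cc="$1" -v tpl="$2" 'BEGIN {
        # part[1] is the command name, part[2] is $ARG1$, part[3] is $ARG2$, ...
        n = split(cc, part, "!")
        out = tpl
        for (i = 2; i <= n; i++) {
            macro = "$ARG" (i - 1) "$"
            rest = out
            out = ""
            # Replace every literal occurrence of the macro with its argument.
            while ((pos = index(rest, macro)) > 0) {
                out = out substr(rest, 1, pos - 1) part[i]
                rest = substr(rest, pos + length(macro))
            }
            out = out rest
        }
        print out
    }'
}

# Hypothetical argument-passing definition:
expand_command_line 'check_nrpe!check_procs!5 10 Z' \
    '$USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$ -a $ARG2$'
```

With the single-argument definition from step 2, everything after `check_nrpe!` lands in $ARG1$, which is why a plain command name such as check_total_procs is the safe form there.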
$ vi /opt/nagios/etc/objects/host-linux.cfg
#----------------------------------------------------------------------------------------------------------------
#192.168.11.3
#----------------------------------------------------------------------------------------------------------------
define host {
use linux-server
host_name bclinux3
alias bclinux3.mydomain
address 192.168.11.3
hostgroups linux-servers
}
#CRITICAL if the round trip average (RTA) is greater than 600 milliseconds
#or the packet loss is 60% or more
#WARNING if the RTA is greater than 200 ms or the packet loss is 20% or more
#OK if the RTA is less than 200 ms and the packet loss is less than 20%
define service {
use generic-service
host_name bclinux3
service_description PING
check_command check_ping!200.0,20%!600.0,60%
}
# Define a service to check the free disk space on the remote host
# through NRPE. The thresholds (warning if < 20% free, critical if
# < 10% free) come from command[check_disk] in the client's nrpe.cfg.
define service{
use generic-service
host_name bclinux3
service_description Free Root Partition
check_command check_nrpe!check_disk
}
# Define a service to check the number of currently logged in
# users on the remote host. Warning if > 5 users, critical
# if > 10 users (thresholds set by command[check_users] in nrpe.cfg).
define service{
use generic-service
host_name bclinux3
service_description Current Users
check_command check_nrpe!check_users
}
define service{
use generic-service
host_name bclinux3
service_description Total Processes
check_command check_nrpe!check_total_procs
}
define service{
use generic-service
host_name bclinux3
service_description Current Load
check_command check_nrpe!check_load
}
define service{
use generic-service
host_name bclinux3
service_description Zombie Processes
check_command check_nrpe!check_zombie_procs
}
Save the file.
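Every name that follows `check_nrpe!` in the service definitions must exist as a `command[...]` entry in the client's nrpe.cfg. The sketch below lists the names a host file refers to, so they can be compared by hand. It assumes one check_command directive per line; the function name `referenced_nrpe_commands` is illustrative.

```shell
# Print the NRPE command names referenced by check_command lines in an
# object configuration file ($1), sorted and without duplicates.
referenced_nrpe_commands() {
    sed -n 's/^[[:space:]]*check_command[[:space:]]\{1,\}check_nrpe!\([^![:space:]]*\).*/\1/p' "$1" |
        LC_ALL=C sort -u
}
```

For example, `referenced_nrpe_commands /opt/nagios/etc/objects/host-linux.cfg` prints the names to look for in nrpe.cfg on the client.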
Tip:
Usage:check_users -w <users> -c <users>
-w, --warning=INTEGER
Set WARNING status if more than INTEGER users are logged in
-c, --critical=INTEGER
Set CRITICAL status if more than INTEGER users are logged in
Check the configuration files:
$ /opt/nagios/bin/nagios -v /opt/nagios/etc/nagios.cfg
If there are no errors, restart Nagios
Restart Nagios:
$ /etc/rc.d/init.d/nagios reload
=========================================================================
Note: Object configuration files:
=========================================================================
Timeperiods:
$ vi /opt/nagios/etc/objects/timeperiods.cfg
Contacts/Contacts groups:
$ vi /opt/nagios/etc/objects/contacts.cfg
#Adding email contacts:
email nagios@hostname,sontaya@mydomain.com
Adding remote Linux/Unix hosts:
$ vi /opt/nagios/etc/objects/host-linux.cfg
Templates (contact, host, service):
$ vi /opt/nagios/etc/objects/templates.cfg
COMMANDS:
$ vi /opt/nagios/etc/objects/commands.cfg
Restart nagios:
$ /etc/rc.d/init.d/nagios restart
Path plugins:
$ vi /opt/nagios/etc/resource.cfg
$USER1$=/opt/nagios/libexec
Path NRPE config
$ vi /usr/local/nagios/etc/nrpe.cfg
Usage:check_ping -H <host_address> -w <wrta>,<wpl>% -c <crta>,<cpl>%
check_command check_ping!100.0,20%!500.0,60%
===========================================================================
Tip: How to check a host, that for security reasons has ping disabled
===========================================================================
1. Copy "check_nrpe" file to path plugin.
$ cp /usr/local/nagios/libexec/check_nrpe /opt/nagios/libexec
2. Define the service and attributes within the default services file.
(place check_nrpe! in front of the check-host-alive)
$ vi /opt/nagios/etc/objects/templates.cfg
## SERVICE TEMPLATES
check_command check_nrpe!check-host-alive
3. Add the command to every client's nrpe.cfg file
$ vi /usr/local/nagios/etc/nrpe.cfg
4. Reload Nagios (Finished)
$ /etc/rc.d/init.d/nagios reload
===========================================================================
Error messages:
===========================================================================
Could not open command file '/var/nagios/rw/nagios.cmd' for update!
Solution: change group "nagios" to "www"
$ id nagios
$ cd /opt/nagios/var/rw/
$ chgrp www nagios.cmd
============================================================================
Tip: Installing nagiosgraph
============================================================================
To see which pre-requisites are installed:
install.pl --check-prereq
To install pre-requisites:
Debian/Ubuntu
sudo apt-get install libcgi-pm-perl librrds-perl libgd-gd2-perl
Redhat/Fedora/CentOS
sudo yum install perl-rrdtool perl-GD
Easy Install
------------
install.pl
To see a list of options:
install.pl --help
Recipe for Manual Installation
------------------------------
These instructions assume an overlay layout, with nagios at /usr/local/nagios
- Extract nagiosgraph into a temporary location:
cd /tmp
tar xzvf nagiosgraph-x.y.z.tgz
- Copy the contents of etc into your preferred configuration location:
mkdir /etc/nagiosgraph
cp etc/* /etc/nagiosgraph
- Edit the perl scripts in the cgi and lib directories, modifying the
"use lib" line to point to the directory from the previous step.
vi cgi/*.cgi lib/insert.pl
- Copy insert.pl to a location from which it can be executed:
cp lib/insert.pl /usr/local/nagios/libexec
- Copy CGI scripts to a script directory served by the web server:
cp cgi/*.cgi /usr/local/nagios/sbin
- Copy CSS and JavaScript files to a directory served by the web server:
cp share/nagiosgraph.css /usr/local/nagios/share
cp share/nagiosgraph.js /usr/local/nagios/share
- Edit /etc/nagiosgraph/nagiosgraph.conf. Set at least the following:
logfile = /var/log/nagiosgraph.log
cgilogfile = /var/log/nagiosgraph-cgi.log
perflog = /var/nagios/perfdata.log
rrddir = /var/nagios/rrd
mapfile = /etc/nagiosgraph/map
nagiosgraphcgiurl = /nagios/cgi-bin
javascript = /nagios/nagiosgraph.js
stylesheet = /nagios/nagiosgraph.css
- Set permissions of "rrddir" (as defined in nagiosgraph.conf) so that
the *nagios* user can write to it and the *www* user can read it:
mkdir /var/nagios/rrd
chown nagios /var/nagios/rrd
chmod 755 /var/nagios/rrd
- Set permissions of "logfile" so that the *nagios* user can write to it:
touch /var/log/nagiosgraph.log
chown nagios /var/log/nagiosgraph.log
chmod 644 /var/log/nagiosgraph.log
- Set permissions of "cgilogfile" so that the *www* user can write to it:
touch /var/log/nagiosgraph-cgi.log
chown www /var/log/nagiosgraph-cgi.log
chmod 644 /var/log/nagiosgraph-cgi.log
- Ensure that the *nagios* user can create and delete perfdata files:
chown nagios /var/nagios
chmod 755 /var/nagios
- In the Nagios configuration file (nagios.cfg) add this:
process_performance_data=1
service_perfdata_file=/var/nagios/perfdata.log
service_perfdata_file_template=$LASTSERVICECHECK$||$HOSTNAME$||$SERVICEDESC$||$SERVICEOUTPUT$||$SERVICEPERFDATA$
service_perfdata_file_mode=a
service_perfdata_file_processing_interval=30
service_perfdata_file_processing_command=process-service-perfdata
- In the Nagios commands file (commands.cfg) add this:
define command {
command_name process-service-perfdata
command_line /usr/local/nagios/libexec/insert.pl
}
- Check the nagios configuration
/usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
- Restart nagios
/etc/init.d/nagios restart
- Verify that nagiosgraph is working by running showconfig.cgi
http://server/nagios/cgi-bin/showconfig.cgi
- Try graphing some data by running show.cgi
http://server/nagios/cgi-bin/show.cgi
- In the Nagios configuration, add a template for graphed services:
define service {
name graphed-service
action_url /nagios/cgi-bin/show.cgi?host=$HOSTNAME$&service=$SERVICEDESC$' onMouseOver='showGraphPopup(this)' onMouseOut='hideGraphPopup()' rel='/nagios/cgi-bin/showgraph.cgi?host=$HOSTNAME$&service=$SERVICEDESC$&period=week&rrdopts=-w+450+-j
register 0
}
- Enable graph links for services by appending the graphed-service to existing
service definitions in the Nagios configuration:
define service {
use local-service,graphed-service
...
}
- Replace the Nagios action icon with the nagiosgraph graph icon:
mv /usr/local/nagios/share/images/action.gif /usr/local/nagios/share/images/action.gif-orig
cp share/graph.gif /usr/local/nagios/share/images/action.gif
- In the nagiosgraph SSI file, set the URL for nagiosgraph.js:
vi share/nagiosgraph.ssi
src="/nagiosgraph/nagiosgraph.js" -> src="/nagios/nagiosgraph.js"
- Install the nagiosgraph SSI file:
cp share/nagiosgraph.ssi /usr/local/nagios/share/ssi/common-header.ssi
- Add links to graphs in the Nagios sidebar (side.php or side.html):
<ul>
<li><a href="/nagios/cgi-bin/show.cgi" target="main">Graphs</a></li>
<li><a href="/nagios/cgi-bin/showhost.cgi" target="main">Graphs by Host</a></li>
<li><a href="/nagios/cgi-bin/showservice.cgi" target="main">Graphs by Service</a></li>
<li><a href="/nagios/cgi-bin/showgroup.cgi" target="main">Graphs by Group</a></li>
</ul>
- Check the nagios configuration
/usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
- Restart nagios
/etc/init.d/nagios restart
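The service_perfdata_file_template set earlier in this recipe writes one line per check with five fields separated by "||". To inspect what Nagios hands to insert.pl, a line of the perfdata log can be split on that separator. This is a minimal sketch; the function name `perfdata_fields` is illustrative and the sample line in the usage note is made up.

```shell
# Read perfdata log lines on stdin (timestamp||host||service||output||perfdata)
# and print host, service description and perfdata, separated by tabs.
# Lines that do not have exactly five fields are skipped.
perfdata_fields() {
    awk -F '[|][|]' 'NF == 5 { printf "%s\t%s\t%s\n", $2, $3, $5 }'
}
```

For example, `perfdata_fields < /var/nagios/perfdata.log` shows which hosts and services currently produce performance data; a made-up line such as `1262304000||bclinux3||PING||PING OK||rta=0.42ms;200;600;0` yields the host, the service name and the `rta=...` string.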
Creating a host template
# This is my template
define host{
name hostTemplate
check_command check-host-alive
max_check_attempts 5
contact_groups admins
notification_interval 30
notification_period 24x7
notification_options d,u,r
register 0
}
# myHost is shorter now that it inherits from hostTemplate
define host{
host_name myHost
alias My Favorite Host
address 192.168.1.254
parents myotherhost
use hostTemplate
}
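To picture what myHost ends up with, the sketch below merges the directives of a template and of an object that uses it: the object's own values win, and the template fills in the rest. It is a simplification under stated assumptions: each input file holds one "directive value" pair per line without the surrounding `define host{ }`, only a single template is handled, and the function name `resolve_inheritance` is illustrative. The `name` and `register` directives of the template are not passed on.

```shell
# Merge template directives ($1) with object directives ($2).
# Later (object) values override earlier (template) values.
resolve_inheritance() {
    awk '
        NF < 2 { next }                                            # skip blank or incomplete lines
        FNR == NR && ($1 == "name" || $1 == "register") { next }   # template-only directives
        $1 == "use" { next }                                       # the link itself is not a setting
        { if (!($1 in line)) order[++n] = $1; line[$1] = $0 }
        END { for (i = 1; i <= n; i++) print line[order[i]] }
    ' "$1" "$2"
}
```

Called with the directives of hostTemplate as the first file and those of myHost as the second, it prints the combined settings in the order in which each directive first appeared.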
Verify the configuration
$ /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg