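When you run a service in production, monitoring has to be in place as well: you need to detect whether the service is healthy and to collect its key metrics.
This article shows how to monitor RabbitMQ, the message queue service, with nagios-plugins-rabbitmq. The plugin set currently provides six checks: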
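1. check_rabbitmq_aliveness: uses the /api/aliveness-test API to send and receive a message.
2. check_rabbitmq_server: uses the /api/nodes API to collect the resource usage of the RabbitMQ server node.
3. check_rabbitmq_objects: uses several APIs to count the object instances on the server, including vhosts, exchanges, bindings, queues and channels.
4. check_rabbitmq_overview: uses the /api/overview API to collect the numbers of pending, ready and unacknowledged messages.
5. check_rabbitmq_queue: uses the /api/queue API to collect the numbers of pending, ready and unacknowledged messages and the number of consumers for a given queue.
6. check_rabbitmq_watermark: uses the /api/nodes API to check whether mem_alarm is set to true.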
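To make the first check more concrete, here is a minimal sketch of the URL that an aliveness probe queries. It assumes the management API is served over plain HTTP and that the vhost name contains no special character other than "/". The function name aliveness_url is made up for this illustration and is not part of nagios-plugins-rabbitmq.

```shell
# Build the management-API URL for an aliveness test on one vhost.
# Assumption: only "/" needs percent-encoding (enough for the default vhost);
# other special characters in a vhost name are NOT handled by this sketch.
aliveness_url() {
    host=$1
    port=$2
    vhost=$3
    # The vhost is a single path segment, so "/" must become %2F.
    encoded=$(printf '%s' "$vhost" | sed 's|/|%2F|g')
    printf 'http://%s:%s/api/aliveness-test/%s\n' "$host" "$port" "$encoded"
}
```

For example, `curl -s -u nagioscheck:PASSWORD "$(aliveness_url 10.1.155.139 15672 /)"` should return a small JSON status document when the broker is healthy.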
1. Install the Nagios::Plugin Perl module
The nagios-plugins-rabbitmq plugins are written in Perl and need the Nagios::Plugin package. Without it they fail with an error like this:
Can't locate Nagios/Plugin.pm in @INC (@INC contains: /usr/lib64/perl5/site_perl/5.8.8/x86_64-linux-thread-multi /usr/lib/perl5/site_perl/5.8.8 /usr/lib/perl5/site_perl /usr/lib64/perl5/vendor_perl/5.8.8/x86_64-linux-thread-multi /usr/lib/perl5/vendor_perl/5.8.8 /usr/lib/perl5/vendor_perl /usr/lib64/perl5/5.8.8/x86_64-linux-thread-multi /usr/lib/perl5/5.8.8 .) at ./check_rabbitmq_server line 12.
I install it with cpanm here; see the article 《使用cpanm安装perl相关模块》 (Installing Perl modules with cpanm).
# cpanm Nagios::Plugin
Building and testing Nagios-Plugin-0.36 ... OK
Successfully installed Nagios-Plugin-0.36
39 distributions installed
Output like the above means the installation succeeded.
You can also download the source package and build it yourself. Download URL: http://search.cpan.org/CPAN/authors/id/T/TO/TONVOON/Nagios-Plugin-0.36.tar.gz
The steps are:
# wget http://search.cpan.org/CPAN/authors/id/T/TO/TONVOON/Nagios-Plugin-0.36.tar.gz
# tar xvfz Nagios-Plugin-0.36.tar.gz
# cd Nagios-Plugin-0.36
# perl Makefile.PL
# make
# make install
2. Install the dependency modules
The check_rabbitmq_* scripts also need the following modules in order to run:
# cpanm LWP JSON
Otherwise you get errors like these:
Can't locate LWP/UserAgent.pm in @INC
Can't locate JSON.pm in @INC
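A quick way to see which of the required modules are still missing, before running any check, is to ask perl to load each one. The sketch below assumes perl is on the PATH; the helper name missing_perl_modules is hypothetical.

```shell
# Print every module name that perl cannot load, one per line.
# Prints nothing when all modules are available.
missing_perl_modules() {
    for mod in "$@"; do
        # "perl -MName -e 1" exits non-zero if the module cannot be loaded.
        if ! perl -M"$mod" -e 1 >/dev/null 2>&1; then
            printf '%s\n' "$mod"
        fi
    done
}
```

Running `missing_perl_modules Nagios::Plugin LWP::UserAgent JSON` lists whatever still has to be installed with cpanm.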
3. Download nagios-plugins-rabbitmq
# cd /usr/local/nagios/libexec/
# wget --no-check-certificate https://github.com/jamesc/nagios-plugins-rabbitmq/archive/master.zip
# unzip master
# mv nagios-plugins-rabbitmq-master nagios-plugins-rabbitmq
# chown -R nagios.nagios nagios-plugins-rabbitmq/
4. Examples
# ./check_rabbitmq_aliveness -H 10.1.155.139 --port=15672 -u 'nagioscheck' -p 'www.ttlsa.com'
RABBITMQ_ALIVENESS OK - vhost: /
# ./check_rabbitmq_overview -H 10.1.155.139 --port=15672 -u 'nagioscheck' -p 'www.ttlsa.com'
RABBITMQ_OVERVIEW OK - messages OK (2) messages_ready OK (2) messages_unacknowledged OK (0) | messages=2;; messages_ready=2;; messages_unacknowledged=0;;
# ./check_rabbitmq_queue -H 10.1.155.139 --port=15672 -u 'nagioscheck' -p 'www.ttlsa.com' --queue=aliveness-test
RABBITMQ_QUEUE OK - messages OK (0) messages_ready OK (0) messages_unacknowledged OK (0) consumers OK (0) | messages=0;; messages_ready=0;; messages_unacknowledged=0;; consumers=0;;
# ./check_rabbitmq_objects -H 10.1.155.139 --port=15672 -u 'nagioscheck' -p 'www.ttlsa.com'
RABBITMQ_OBJECTS OK - Gathered Object Counts | vhost=1;; exchange=15;; binding=2;; queue=1;; channel=0;;
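In the outputs of these checks, everything after the "|" is Nagios performance data in label=value;warn;crit form. If you want to feed one of those numbers into another script, a small helper along the following lines can extract it. This is only a sketch: the name perfdata_value is hypothetical, and it assumes single-line output with labels that contain no spaces or quotes.

```shell
# Print the value of one perfdata label from a single-line plugin output.
# Returns 1 if the output has no perfdata section or the label is absent.
perfdata_value() {
    output=$1
    label=$2
    case $output in
        *"|"*) perf=${output#*"|"} ;;   # keep only the part after the first "|"
        *) return 1 ;;
    esac
    # Word splitting is intended here: perfdata items are separated by spaces.
    for item in $perf; do
        case $item in
            "$label"=*)
                value=${item#*=}                # drop "label="
                printf '%s\n' "${value%%;*}"    # drop ";warn;crit" and the rest
                return 0
                ;;
        esac
    done
    return 1
}
```

For instance, passing the check_rabbitmq_overview output together with the label messages_ready prints just that counter.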
5. Define the Nagios commands
# vim /usr/local/nagios/etc/objects/commands.cfg
define command{
    command_name    check_rabbitmq_aliveness
    command_line    $USER1$/nagios-plugins-rabbitmq/scripts/check_rabbitmq_aliveness -H $ARG1$ --port=$ARG2$ -u $ARG3$ -p $ARG4$
}
define command{
    command_name    check_rabbitmq_overview
    command_line    $USER1$/nagios-plugins-rabbitmq/scripts/check_rabbitmq_overview -H $ARG1$ --port=$ARG2$ -u $ARG3$ -p $ARG4$
}
define command{
    command_name    check_rabbitmq_queue
    command_line    $USER1$/nagios-plugins-rabbitmq/scripts/check_rabbitmq_queue -H $ARG1$ --port=$ARG2$ -u $ARG3$ -p $ARG4$ --queue $ARG5$
}
define command{
    command_name    check_rabbitmq_objects
    command_line    $USER1$/nagios-plugins-rabbitmq/scripts/check_rabbitmq_objects -H $ARG1$ --port=$ARG2$ -u $ARG3$ -p $ARG4$
}
The user name and password can be defined in /usr/local/nagios/etc/resource.cfg so that you do not have to pass them in every service definition.
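As a rough sketch of that idea, the helper below prints two resource.cfg entries for the credentials. The macro numbers $USER3$ and $USER4$ are an arbitrary choice for this example (pick any $USERn$ macros that your resource.cfg does not already use), and the function name resource_macros is hypothetical.

```shell
# Print resource.cfg lines that store the RabbitMQ credentials in user macros.
# Assumption: $USER3$ and $USER4$ are still free in your resource.cfg.
resource_macros() {
    user=$1
    password=$2
    # Single quotes keep the literal "$" characters of the macro names.
    printf '$USER3$=%s\n' "$user"
    printf '$USER4$=%s\n' "$password"
}
```

You could append the result with `resource_macros nagioscheck 'www.ttlsa.com' >> /usr/local/nagios/etc/resource.cfg`, and a command_line could then use `-u $USER3$ -p $USER4$` in place of `-u $ARG3$ -p $ARG4$`, which also keeps the password out of the service definitions.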
6. Create the RabbitMQ service checks
define service{
    use                     generic-service
    host_name               121.207.22.33
    service_description     check_rabbitmq_aliveness
    normal_check_interval   2
    contact_groups          admin_4
    check_command           check_rabbitmq_aliveness!10.1.22.33!15672!nagioscheck!www.ttlsa.com
}
define service{
    use                     generic-service
    host_name               121.207.22.33
    service_description     check_rabbitmq_queue
    normal_check_interval   2
    contact_groups          admin_4
    check_command           check_rabbitmq_queue!10.1.22.33!15672!nagioscheck!www.ttlsa.com!aliveness-test
}
Add further commands and service checks as your own needs require.
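When you write a check_command of your own, keep in mind that Nagios splits it on "!": the first field is the command name and the remaining fields become $ARG1$, $ARG2$ and so on. The sketch below only mimics that splitting so you can see which value lands in which macro; show_args is a hypothetical helper, not a Nagios tool.

```shell
# Show how a check_command string maps onto the command name and $ARGn$ macros.
show_args() {
    check_command=$1
    old_ifs=$IFS
    IFS='!'
    set -f                      # no filename globbing while splitting
    set -- $check_command       # split on "!" into positional parameters
    set +f
    IFS=$old_ifs
    printf 'command_name=%s\n' "$1"
    shift
    n=1
    for arg in "$@"; do
        printf '$ARG%d$=%s\n' "$n" "$arg"
        n=$((n + 1))
    done
}
```

Calling it with `'check_rabbitmq_queue!10.1.22.33!15672!nagioscheck!www.ttlsa.com!aliveness-test'` shows that the queue name ends up in $ARG5$. After editing the configuration, it is worth validating it with `/usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg` (paths assumed from the install prefix used in this article) before reloading Nagios.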
When reposting, please credit 运维生存时间: http://www.ttlsa.com/html/4048.html