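Monitoring OGG Lag with Zabbix

Background

OGG recently ran into a problem that went unnoticed for a while, so I decided to bring OGG monitoring into Zabbix as well. For this kind of monitoring, the question is not only how to collect the data, but also how to make good use of Zabbix features such as templates and low-level discovery, which makes configuration easier and keeps the setup extensible later on.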

Collecting OGG information

The usual way to check how the OGG processes are running is the info all command in ggsci:

MANAGER     RUNNING                                           
EXTRACT RUNNING EPRISK 00:00:03 00:00:08
EXTRACT RUNNING ERDMEDW 00:00:04 00:00:06
EXTRACT RUNNING ESHDW 00:00:00 00:00:01
EXTRACT RUNNING PCIF 00:00:00 00:00:08
EXTRACT RUNNING PEIF 00:00:00 00:00:08
EXTRACT RUNNING PHG 00:00:00 00:00:08
EXTRACT RUNNING PQH 00:00:00 00:00:06
EXTRACT RUNNING PRISKMGR 00:00:00 00:00:08
EXTRACT RUNNING PSHDW 00:00:00 00:00:06
REPLICAT RUNNING RCIF 00:00:10 00:00:06
REPLICAT RUNNING REDMDB1 00:00:00 00:00:01
REPLICAT RUNNING REDQDB1 00:00:00 00:00:07
REPLICAT RUNNING REIF 00:00:00 00:00:00
REPLICAT RUNNING REPWIND 00:00:00 00:00:06
REPLICAT RUNNING REPWIND2 00:00:00 00:00:00
REPLICAT RUNNING REPWIND3 00:00:00 00:00:05
REPLICAT RUNNING REPWIND4 00:00:00 00:00:04
REPLICAT RUNNING RZNMGR 00:00:00 00:00:00
REPLICAT RUNNING RZNMGRC 00:00:00 00:00:01

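There are 19 Extract and Replicat processes here. Minute-level monitoring is enough for our needs, so we keep four columns: status, process name, lag, and time (the Lag at Chkpt and Time Since Chkpt columns of info all). The awk expression converts each HH:MM:SS value to whole minutes and drops the seconds, which is why lags of a few seconds show up as 0.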

echo "info all"|ggsci|egrep 'EXTRACT|REPLICAT'|awk -F"[ ]+|:" '{print $2,$3,$4*60+$5,$7*60+$8}'

RUNNING EPRISK 0 0
RUNNING ERDMEDW 0 0
RUNNING ESHDW 0 0
RUNNING PCIF 0 0
RUNNING PEIF 0 0
RUNNING PHG 0 0
RUNNING PQH 0 0
RUNNING PRISKMGR 0 0
RUNNING PSHDW 0 0
RUNNING RCIF 0 0
RUNNING REDMDB1 0 0
RUNNING REDQDB1 0 0
RUNNING REIF 0 0
RUNNING REPWIND 0 0
RUNNING REPWIND2 0 0
RUNNING REPWIND3 0 0
RUNNING REPWIND4 0 0
RUNNING RZNMGR 0 0
RUNNING RZNMGRC 0 0
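
To make the field arithmetic in the awk one-liner easier to follow, here is a minimal Python sketch of the same transformation. The function names parse_info_all and to_minutes are made up for illustration, and the second sample row uses an invented lag of 01:02:10 so that the conversion is visible; it is not taken from the real output above.

```python
def to_minutes(hhmmss):
    """Convert HH:MM:SS to whole minutes, dropping the seconds (like $4*60+$5)."""
    hours, minutes, _seconds = (int(part) for part in hhmmss.split(":"))
    return hours * 60 + minutes


def parse_info_all(text):
    """Return (status, name, lag_minutes, time_minutes) for each process row."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # Keep only EXTRACT/REPLICAT rows that carry both time columns.
        if len(fields) != 5 or fields[0] not in ("EXTRACT", "REPLICAT"):
            continue
        _program, status, name, lag, since_chkpt = fields
        rows.append((status, name, to_minutes(lag), to_minutes(since_chkpt)))
    return rows


# Hypothetical sample: the RCIF lag is invented for the demonstration.
sample = """MANAGER     RUNNING
EXTRACT RUNNING EPRISK 00:00:03 00:00:08
REPLICAT RUNNING RCIF 01:02:10 00:00:06
"""

for row in parse_info_all(sample):
    print(*row)
```

With this sample the EPRISK row comes out as 0 0, matching the awk output, and the invented RCIF row comes out as 62 0.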

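A shell script wraps the command, and the data it returns is written to the temporary file /etc/zabbix/scripts/ogg.cfg. The script itself only prints to standard output, so the caller is responsible for redirecting it into the file.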

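#!/bin/bash

# Print status, name, lag and time (both in minutes) for every process.
all()
{
    # Load the oracle user's environment so that ggsci can be found.
    source /home/oracle/.bash_profile
    echo "info all" | ggsci | egrep 'EXTRACT|REPLICAT' | awk -F"[ ]+|:" '{print $2,$3,$4*60+$5,$7*60+$8}'
}

# Print only the process names.
name()
{
    source /home/oracle/.bash_profile
    echo "info all" | ggsci | egrep 'EXTRACT|REPLICAT' | awk '{print $3}'
}

case "$1" in
    all)
        all
        ;;
    name)
        name
        ;;
    *)
        echo "Usage: ./$(basename "$0") [all|name]"
        ;;
esac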
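
How the script's output ends up in ogg.cfg is not shown here. One possible approach, sketched below under my own assumptions, is for a periodic job to capture the output and write it through a temporary file followed by a rename, so that a reader never sees a half-written file. The helper name write_snapshot is hypothetical, the sketch assumes Python 3 (os.replace does not exist in Python 2), and the demo writes to a scratch directory rather than /etc/zabbix/scripts.

```python
import os
import tempfile


def write_snapshot(text, path):
    """Write text to path via a temporary file in the same directory."""
    directory = os.path.dirname(path) or "."
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".ogg.", suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as tmp_file:
            tmp_file.write(text)
        # The rename is atomic on POSIX when both paths are on one filesystem.
        os.replace(tmp_path, path)
    except Exception:
        # Clean up the temporary file if anything went wrong.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


# Demo against a scratch directory instead of the real ogg.cfg.
demo_path = os.path.join(tempfile.mkdtemp(), "ogg.cfg")
write_snapshot("RUNNING EPRISK 0 0\n", demo_path)
with open(demo_path) as handle:
    print(handle.read(), end="")
```

A plain shell redirection would also work; the rename only matters if the agent may read the file while it is being rewritten.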

Converting to JSON that Zabbix understands

Zabbix low-level discovery requires the returned value to be JSON, so the data above needs to be reformatted:

#!/usr/bin/env python
# coding: utf-8
import json
import sys

FILE_PATH = '/etc/zabbix/scripts/ogg.cfg'


# Build a JSON string containing every column for every process
def generate_json():
    headers = ['STATUS', 'NAME', 'LAG', 'TIME']

    # Parse each non-empty line into a dict keyed by the headers
    data = []
    with open(FILE_PATH, 'r') as txt_file:
        for line in txt_file:
            values = line.strip().split()
            if values:
                data.append(dict(zip(headers, values)))

    # Convert the data structure to JSON
    return json.dumps(data, indent=4)


# Build the discovery rule payload keyed by the {#NAME} macro
def query_name():
    data = []
    with open(FILE_PATH, 'r') as txt_file:
        for line in txt_file:
            # Split the line into columns on whitespace
            columns = line.strip().split()
            # The process name is the second column
            if len(columns) > 1:
                data.append({'{#NAME}': columns[1]})

    # Convert the data structure to JSON
    return json.dumps(data, indent=4)


args = sys.argv
if len(args) == 1:
    print(generate_json())
elif args[1] == 'query_name':
    print(query_name())
else:
    print("Invalid argument!")
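
One thing to watch for with the discovery output: as far as I know, Zabbix 4.2 and later accept a bare JSON array from a discovery rule, while earlier versions expect the rows under a "data" key. The sketch below shows both shapes. The helper to_lld and its wrap_in_data flag are hypothetical names for illustration and are not part of the script above.

```python
import json


def to_lld(names, wrap_in_data=False):
    """Build a low-level discovery payload from a list of process names."""
    rows = [{"{#NAME}": name} for name in names]
    # Older Zabbix versions are assumed to need the {"data": [...]} wrapper.
    payload = {"data": rows} if wrap_in_data else rows
    return json.dumps(payload, indent=4)


print(to_lld(["EPRISK", "RCIF"]))
print(to_lld(["EPRISK", "RCIF"], wrap_in_data=True))
```

If the discovery rule finds nothing on an older server, the wrapper is the first thing worth checking.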

The resulting data

All information:

[root@ scripts]# ./ogg_delay.py
[
    {
        "STATUS": "RUNNING",
        "LAG": "0",
        "NAME": "EPRISK",
        "TIME": "0"
    },
    {
        "STATUS": "RUNNING",
        "LAG": "0",
        "NAME": "ERDMEDW",
        "TIME": "0"
    },
    {
        "STATUS": "RUNNING",
        "LAG": "0",
        "NAME": "ESHDW",
        "TIME": "0"
    },
    {
        "STATUS": "RUNNING",
        "LAG": "0",
        "NAME": "PCIF",
        "TIME": "0"
    },
    {
        "STATUS": "RUNNING",
        "LAG": "0",
        "NAME": "PRISKMGR",
        "TIME": "0"
    },
    {
        "STATUS": "RUNNING",
        "LAG": "0",
        "NAME": "PSHDW",
        "TIME": "0"
    },
    {
        "STATUS": "RUNNING",
        "LAG": "0",
        "NAME": "RCLS",
        "TIME": "0"
    },
    {
        "STATUS": "RUNNING",
        "LAG": "0",
        "NAME": "REDMDB1",
        "TIME": "0"
    },
    {
        "STATUS": "RUNNING",
        "LAG": "0",
        "NAME": "REDQDB1",
        "TIME": "0"
    },
    {
        "STATUS": "RUNNING",
        "LAG": "0",
        "NAME": "REPWIND",
        "TIME": "0"
    },
    {
        "STATUS": "RUNNING",
        "LAG": "0",
        "NAME": "REPWIND2",
        "TIME": "0"
    },
    {
        "STATUS": "RUNNING",
        "LAG": "0",
        "NAME": "REPWIND3",
        "TIME": "0"
    },
    {
        "STATUS": "RUNNING",
        "LAG": "0",
        "NAME": "RZNMGR",
        "TIME": "0"
    },
    {
        "STATUS": "RUNNING",
        "LAG": "0",
        "NAME": "RZNMGRC",
        "TIME": "0"
    }
]

Names only:

[root@ scripts]# ./ogg_delay.py query_name
[
    {
        "{#NAME}": "EPRISK"
    },
    {
        "{#NAME}": "ERDMEDW"
    },
    {
        "{#NAME}": "ESHDW"
    },
    {
        "{#NAME}": "PCIF"
    },
    {
        "{#NAME}": "PRISKMGR"
    },
    {
        "{#NAME}": "PSHDW"
    },
    {
        "{#NAME}": "RCLS"
    },
    {
        "{#NAME}": "REDMDB1"
    },
    {
        "{#NAME}": "REDQDB1"
    },
    {
        "{#NAME}": "REPWIND"
    },
    {
        "{#NAME}": "REPWIND2"
    },
    {
        "{#NAME}": "REPWIND3"
    },
    {
        "{#NAME}": "RZNMGR"
    },
    {
        "{#NAME}": "RZNMGRC"
    }
]

Creating the template

With the data in hand, the scripts are ready. Two custom keys need to be added to the agent configuration:

UserParameter=ogg.discovery,/etc/zabbix/scripts/ogg.py
UserParameter=ogg_name,/etc/zabbix/scripts/ogg.py query_name

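For efficiency, the implementation uses dependent items plus preprocessing. This avoids having every item parse the text over and over: the data is fetched and processed once, and every result is derived from that single value.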

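  1. Create an item that fetches all OGG information by calling the custom key ogg.discovery.

  2. Create a discovery rule that fetches all OGG process names by calling the custom key ogg_name.

  3. Add three dependent items (item prototypes under the discovery rule, since they use {#NAME}) that process the data from step 1. A JSONPath preprocessing step extracts each result directly.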

    Status: $[?(@.NAME=='{#NAME}')].STATUS.first()
    Apply lag: $[?(@.NAME=='{#NAME}')].LAG.first()
    Transfer lag: $[?(@.NAME=='{#NAME}')].TIME.first()
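
To see what these expressions select, here is a rough Python equivalent of the filter-then-first() logic applied to the master item's JSON. This is only an illustration of the selection, not how Zabbix evaluates JSONPath internally. The function name first_field is hypothetical, and the RCIF values in the sample are invented so that the result is not just 0.

```python
import json


def first_field(master_json, name, field):
    """Mimic $[?(@.NAME=='<name>')].<field>.first() on the master item value."""
    rows = json.loads(master_json)
    matches = [row[field] for row in rows
               if row.get("NAME") == name and field in row]
    if not matches:
        # Roughly analogous to the preprocessing step finding no match.
        raise LookupError("no process named %r with field %r" % (name, field))
    return matches[0]


# Hypothetical master item value; the RCIF numbers are made up.
master = json.dumps([
    {"STATUS": "RUNNING", "NAME": "EPRISK", "LAG": "0", "TIME": "0"},
    {"STATUS": "RUNNING", "NAME": "RCIF", "LAG": "3", "TIME": "1"},
])

print(first_field(master, "RCIF", "LAG"))
print(first_field(master, "EPRISK", "STATUS"))
```

Each discovered process gets its own three dependent items, but all of them read from the same master value, so ggsci output is only parsed once per collection interval.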

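Create a new template in Zabbix, configure its macros, add the items, triggers and so on, and finally link the template to the hosts that need monitoring.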




Source: https://www.xbdba.com/2020/07/09/zabbix-monitor-ogg-delay/
Author: xbdba
Published: July 9, 2020