Juniper Telemetry Configuration Example
An example of configuring telemetry with Telegraf over gRPC (Junos Telemetry Interface, or JTI) and gNMI.
Reference documentation:
https://supportportal.juniper.net/s/article/telemetry-configuration-example?language=en_US
https://www.juniper.net/documentation/us/en/software/junos/interfaces-telemetry/index.html
Configuration
Configure the gRPC service (telemetry sensor) and SSH:
set system services ssh
set system services extension-service request-response grpc clear-text address 10.0.0.2
set system services extension-service request-response grpc clear-text port 32767
set system services extension-service request-response grpc skip-authentication
set system services extension-service notification allow-clients address 10.0.0.1/32
Configure the firewall filter for the gRPC service:
set policy-options prefix-list pl_JTI_server 10.0.0.1/32
set firewall family inet filter ACCEPT-JTI term ACCEPT-JTI from source-prefix-list pl_JTI_server
set firewall family inet filter ACCEPT-JTI term ACCEPT-JTI from protocol tcp
set firewall family inet filter ACCEPT-JTI term ACCEPT-JTI from port 32767
set firewall family inet filter ACCEPT-JTI term ACCEPT-JTI then count jti-count
set firewall family inet filter ACCEPT-JTI term ACCEPT-JTI then accept
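The firewall filter only helps if it accepts the same TCP port the gRPC service listens on. As a rough sanity check, the sketch below parses a list of Junos `set` commands and compares the two. The helper names (`grpc_port`, `filter_ports`, `ports_consistent`) and the sample lines are made up for illustration; this is not a Juniper tool.

```python
import re

def grpc_port(set_lines):
    """Return the gRPC clear-text port found in Junos 'set' commands, or None."""
    for line in set_lines:
        match = re.search(r"request-response grpc clear-text port (\d+)", line)
        if match:
            return int(match.group(1))
    return None

def filter_ports(set_lines, filter_name):
    """Return the ports matched by 'from port' terms of the named inet filter."""
    pattern = re.compile(
        r"firewall family inet filter %s term \S+ from port (\d+)"
        % re.escape(filter_name)
    )
    ports = []
    for line in set_lines:
        match = pattern.search(line)
        if match:
            ports.append(int(match.group(1)))
    return ports

def ports_consistent(set_lines, filter_name):
    """True when the filter accepts the port the gRPC service listens on."""
    port = grpc_port(set_lines)
    return port is not None and port in filter_ports(set_lines, filter_name)

# Hypothetical sample input: one consistent pair and one mismatched pair.
sample = [
    "set system services extension-service request-response grpc clear-text port 32767",
    "set firewall family inet filter ACCEPT-JTI term ACCEPT-JTI from port 32767",
]
mismatched = [
    sample[0],
    "set firewall family inet filter ACCEPT-JTI term ACCEPT-JTI from port 40000",
]

print(ports_consistent(sample, "ACCEPT-JTI"))
print(ports_consistent(mismatched, "ACCEPT-JTI"))
```

Feeding it the output of `show configuration | display set` (saved to a file and split into lines) would be one way to use it.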
Install Telegraf:
I use Ubuntu / Debian. Add the InfluxData repository and create a new sources.list file:
# influxdata-archive_compat.key GPG fingerprint:
# 9D53 9D90 D332 8DC7 D6C8 D3B9 D8FF 8E1F 7DF8 B07E
wget -q https://repos.influxdata.com/influxdata-archive_compat.key
echo '393e8779c89ac8d958f81f942f9ad7fb82a25e133faddaf92e15b16e6ac9ce4c influxdata-archive_compat.key' | sha256sum -c && cat influxdata-archive_compat.key | gpg --dearmor | sudo tee /etc/apt/trusted.gpg.d/influxdata-archive_compat.gpg > /dev/null
echo 'deb [signed-by=/etc/apt/trusted.gpg.d/influxdata-archive_compat.gpg] https://repos.influxdata.com/debian stable main' | sudo tee /etc/apt/sources.list.d/influxdata.list
sudo apt-get update && sudo apt-get install telegraf
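The `sha256sum -c` step above refuses to install the key unless its digest matches the published value. For readers who want to see that check spelled out, here is a minimal Python sketch of the same idea using only `hashlib`. The function names are illustrative, and the file path passed to `file_matches` is whatever you downloaded.

```python
import hashlib

def sha256_matches(data, expected_hex):
    """Compare the SHA-256 digest of 'data' (bytes) with an expected hex string."""
    return hashlib.sha256(data).hexdigest() == expected_hex.strip().lower()

def file_matches(path, expected_hex):
    """Read a file in binary mode and check its SHA-256 digest."""
    with open(path, "rb") as handle:
        return sha256_matches(handle.read(), expected_hex)

# Self-contained demonstration on in-memory bytes rather than the real key file.
payload = b"example key material"
good_digest = hashlib.sha256(payload).hexdigest()
print(sha256_matches(payload, good_digest))
print(sha256_matches(payload + b"tampered", good_digest))
```

With the real key you would call `file_matches("influxdata-archive_compat.key", "<digest from the command above>")`.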
The Telegraf configuration file is located at /etc/telegraf/telegraf.conf.
Use one of the following two methods:
Method 1: Use inputs.jti_openconfig_telemetry
This plugin reads Juniper's OpenConfig telemetry data through the Junos Telemetry Interface (JTI).
# inputs.jti_openconfig_telemetry
[agent]
interval = "10s"
round_interval = true
metric_batch_size = 1000
metric_buffer_limit = 10000
collection_jitter = "0s"
flush_interval = "10s"
flush_jitter = "0s"
precision = "0s"
hostname = ""
[[inputs.jti_openconfig_telemetry]]
servers = ["10.0.0.2:32767"] # <<< device IP
username = "root"
password = "Juniper"
client_id = "Switch"
tags = {location = "HKG"} # <<< custom tags
sample_frequency = "10000ms" # <<< subscription reporting frequency; Juniper recommends a minimum of 2s, adjust to your environment
sensors = [
"/junos/system/linecard/cpu/memory",
"/junos/system/linecard/firewall/",
"/junos/system/linecard/interface/",
"/junos/system/linecard/interface/logical/usage",
"/junos/system/linecard/packet/usage/",
"/network-instances/network-instance[instance-name='name']/protocols/protocol/evpn/irb-interfaces/",
"/interfaces/interface/state/",
"/junos/system/linecard/environment",
"/junos/system/linecard/optics/",
]
collection_jitter = "0s"
flush_interval = "15s"
flush_jitter = "0s"
fielddrop = ["/interfaces/interface/subinterfaces/subinterface/ipv6/neighbors/neighbor/state/is-router"]
[[processors.regex]]
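# Replace matched text with the given text, which makes it easier to define variables in Grafana and filter specific interfaces
# If the "/interfaces/interface/subinterfaces/subinterface/state/description" field is empty, replace its value with "-"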
[[processors.regex.fields]]
key = "/interfaces/interface/subinterfaces/subinterface/state/description"
pattern = "^$"
replacement = "-"
[[outputs.prometheus_client]]
listen = ":9273"
path = "/metrics"
# Expiration interval for each metric. 0 == no expiration
expiration_interval = "5s"
export_timestamp = true # attach the timestamp to each metric so Prometheus uses it instead of the job's scrape timestamp
metric_buffer_limit = 100000
[outputs.prometheus_client.tagpass]
location = ["HKG"]
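Method 2: Use inputs.gnmi
This plugin consumes telemetry data through the gNMI Subscribe method. It is vendor-agnostic and works on any platform that supports the gNMI specification.
Reference: https://github.com/influxdata/telegraf/tree/master/plugins/inputs/gnmi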
# inputs.gnmi
[agent]
interval = "10s"
round_interval = true
metric_batch_size = 1000
metric_buffer_limit = 10000
collection_jitter = "0s"
flush_interval = "10s"
flush_jitter = "0s"
precision = "0s"
hostname = ""
[[inputs.gnmi]]
addresses = ["10.0.0.2:32767"] # <<< device IP
[[inputs.gnmi.subscription]]
name = "ifcounters"
origin = "openconfig-interfaces"
path = "/interfaces/interface/state/counters"
subscription_mode = "sample"
sample_interval = "10s"
[[inputs.gnmi.subscription]]
name = "component"
origin = "openconfig-system"
path = "/components/component/state/name"
subscription_mode = "sample"
sample_interval = "10s"
[[outputs.prometheus_client]]
listen = ":9273"
path = "/metrics"
expiration_interval = "5s" # metric expiration time. 0 == no expiration
export_timestamp = true # attach the timestamp to each metric so Prometheus uses it instead of the job's scrape timestamp
Once the configuration is complete, restart the Telegraf service:
systemctl restart telegraf
Check the status of the sensors:
curl localhost:9273/metrics | grep sensors
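If you would rather filter the endpoint's output in a script than with `grep`, something like the following sketch could work. It scans Prometheus text-exposition output and keeps the sample lines whose metric name contains a substring. Unlike `grep`, it skips `# HELP` and `# TYPE` comment lines. The metric names in the sample text are invented, and `fetch_metrics` assumes the listener from the Telegraf config above is reachable.

```python
from urllib.request import urlopen

def fetch_metrics(url="http://localhost:9273/metrics"):
    """Download the exposition text (assumes Telegraf is listening at this URL)."""
    with urlopen(url, timeout=5) as response:
        return response.read().decode("utf-8")

def grep_metrics(exposition_text, needle):
    """Return sample lines whose metric name contains 'needle'."""
    hits = []
    for raw in exposition_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and HELP/TYPE comments
        # The metric name ends at the first '{' (labels) or space (value).
        name = line.split("{", 1)[0].split(" ", 1)[0]
        if needle in name:
            hits.append(line)
    return hits

# Hypothetical exposition text, used so the example runs without a network.
sample_text = """# HELP example_sensors_up hypothetical gauge
# TYPE example_sensors_up gauge
example_sensors_up{device="10.0.0.2"} 1
example_other_metric 42
"""

for line in grep_metrics(sample_text, "sensors"):
    print(line)
```

Against a live instance, `grep_metrics(fetch_metrics(), "sensors")` would be the equivalent call.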
Prometheus configuration
- job_name: "JTI-HK1-1"
  scrape_interval: 1s # <<< must be smaller than the sample_frequency value in Telegraf
  honor_timestamps: true # use the timestamp carried by the metric itself
  honor_labels: true
  static_configs:
    - targets: ["localhost:9273"]
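The comment on `scrape_interval` states a constraint between two files: the Prometheus scrape interval has to be shorter than Telegraf's `sample_frequency`. A small sketch like this one can compare the two duration strings. It handles only the simple `<number><unit>` forms used in this post (`ms`, `s`, `m`, `h`), not every duration syntax either tool accepts, and the function names are illustrative.

```python
import re

# Milliseconds per unit, kept as integers to avoid float rounding.
_UNIT_MS = {"ms": 1, "s": 1000, "m": 60_000, "h": 3_600_000}

def to_millis(duration):
    """Convert a simple duration such as '10000ms' or '1s' to milliseconds."""
    match = re.fullmatch(r"(\d+)(ms|s|m|h)", duration.strip())
    if match is None:
        raise ValueError("unsupported duration: %r" % duration)
    return int(match.group(1)) * _UNIT_MS[match.group(2)]

def scrape_is_fast_enough(scrape_interval, sample_frequency):
    """True when Prometheus scrapes more often than the device reports."""
    return to_millis(scrape_interval) < to_millis(sample_frequency)

print(scrape_is_fast_enough("1s", "10000ms"))   # values used in this post
print(scrape_is_fast_enough("15s", "10000ms"))  # a scrape interval that is too long
```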
Querying interface traffic in Grafana
Query a single physical interface xe-0/0/0:2, any index
# Query a single physical interface xe-0/0/0:2, any index
rate(
  interfaces__interfaces_interface_subinterfaces_subinterface_state_counters_in_octets{
    interfaces_interface__name=~"xe-0/0/0:2",
    interfaces_interface_subinterfaces_subinterface__index=~".+",
    system_id=~"HKG02.+"
  } [$__rate_interval]
) * 8
+ on (system_id, interfaces_interface__name, interfaces_interface_subinterfaces_subinterface__index)
group_left (interfaces_interface_subinterfaces_subinterface_state_description)
last_over_time(
  interfaces__interfaces_interface_subinterfaces_subinterface_state_index{
    interfaces_interface_subinterfaces_subinterface_state_oper_status="UP",
    interfaces_interface__name=~"xe-0/0/0:2",
    interfaces_interface_subinterfaces_subinterface__index=~".+",
    interfaces_interface_subinterfaces_subinterface_state_description=~".+",
    system_id=~"HKG02.+"
  } [2m]
) * 0
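The `rate(...) * 8` part of the query turns an octet counter into bits per second: `rate()` gives the per-second increase in octets, and multiplying by 8 converts octets to bits. The sketch below does the same arithmetic on two counter readings. It is a simplification: the real `rate()` also handles counter resets and extrapolates across the window, while this version simply rejects a decreasing counter.

```python
def bits_per_second(octets_start, octets_end, seconds):
    """Average bit rate between two readings of an octet counter."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    delta = octets_end - octets_start
    if delta < 0:
        # A real counter reset is handled by PromQL's rate(); this sketch refuses it.
        raise ValueError("counter went backwards")
    return delta * 8 / seconds

# 1,250,000 octets transferred in 10 seconds is 1,000,000 bit/s.
print(bits_per_second(5_000_000, 6_250_000, 10))
```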
Query a single physical interface et-0/1/5 with specific indexes, for example index = 536/538/541/552/568
# Query a single physical interface et-0/1/5, index=~"536|538|541|552|568"
-rate(
  interfaces__interfaces_interface_subinterfaces_subinterface_state_counters_out_octets{
    interfaces_interface__name=~"et-0/1/5",
    interfaces_interface_subinterfaces_subinterface__index=~"536|538|541|552|568",
    system_id=~"HKG02.+"
  } [$__rate_interval]
) * 8
+ on (system_id, interfaces_interface__name, interfaces_interface_subinterfaces_subinterface__index)
group_left (interfaces_interface_subinterfaces_subinterface_state_description)
last_over_time(
  interfaces__interfaces_interface_subinterfaces_subinterface_state_index{
    interfaces_interface_subinterfaces_subinterface_state_oper_status="UP",
    interfaces_interface__name=~"et-0/1/5",
    interfaces_interface_subinterfaces_subinterface__index=~"536|538|541|552|568",
    interfaces_interface_subinterfaces_subinterface_state_description=~".+",
    system_id=~"HKG02.+"
  } [2m]
) * 0
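Both single-interface queries end with `+ on (...) group_left (...) last_over_time(...) * 0`. Multiplying the right-hand series by 0 means the addition leaves the rate unchanged; the join is there only to copy the description label onto each rate series and to drop series with no matching "UP" entry. The sketch below imitates that behaviour on plain Python data. The short label names (`name`, `index`, `description`) stand in for the long metric labels and are assumptions for readability.

```python
def attach_labels(left, right, on, extra):
    """Imitate 'left + on(on) group_left(extra) right * 0'.

    left and right are lists of (labels_dict, value). Each left sample is
    matched to the right sample with equal 'on' labels; unmatched left
    samples are dropped, as in a PromQL vector match.
    """
    index = {}
    for labels, value in right:
        key = tuple(labels.get(name) for name in on)
        index[key] = (labels, value)

    joined = []
    for labels, value in left:
        key = tuple(labels.get(name) for name in on)
        if key not in index:
            continue
        right_labels, right_value = index[key]
        merged = dict(labels)
        for name in extra:
            if name in right_labels:
                merged[name] = right_labels[name]
        # Adding right_value * 0 keeps the left value unchanged.
        joined.append((merged, value + right_value * 0))
    return joined

# Hypothetical samples: one interface that is UP and one with no match.
rates = [
    ({"system_id": "HKG02-a", "name": "et-0/1/5", "index": "536"}, 1000.0),
    ({"system_id": "HKG02-a", "name": "et-0/1/5", "index": "999"}, 5.0),
]
info = [
    ({"system_id": "HKG02-a", "name": "et-0/1/5", "index": "536",
      "description": "uplink"}, 536.0),
]

for labels, value in attach_labels(rates, info,
                                   on=("system_id", "name", "index"),
                                   extra=("description",)):
    print(labels, value)
```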
Query aggregated interface ae3.0, showing only the rates of its member interfaces (ae3.0 itself is not shown)
# Query aggregated interface ae3.0, showing only the rates of its member interfaces (ae3.0 itself is not shown)
rate(
  interfaces__interfaces_interface_subinterfaces_subinterface_state_counters_${direction}_octets{
    interfaces_interface_subinterfaces_subinterface_state_parent_ae_name=~"ae3.0",
    interfaces_interface_subinterfaces_subinterface__index=~"0",
    system_id=~"SJC03.+"
  } [$__rate_interval]
) * 8
+ on (system_id, interfaces_interface_subinterfaces_subinterface__index)
group_left (interfaces_interface_subinterfaces_subinterface_state_description)
last_over_time(
  interfaces__interfaces_interface_subinterfaces_subinterface_state_index{
    interfaces_interface_subinterfaces_subinterface_state_oper_status="UP",
    interfaces_interface__name=~"ae3",
    interfaces_interface_subinterfaces_subinterface_state_name=~"ae3.0",
    interfaces_interface_subinterfaces_subinterface__index=~"0",
    #interfaces_interface_subinterfaces_subinterface_state_description=~".+",
    system_id=~"SJC03.+"
  } [2m]
) * 0
Query aggregated interface ae14.1, adding up the rates of all member interfaces (with the sum function) to show the rate of ae14.1 itself
# Query aggregated interface ae14.1, adding up the rates of all member interfaces (with the sum function) to show the rate of ae14.1 itself
sum(
  rate(
    interfaces__interfaces_interface_subinterfaces_subinterface_state_counters_out_octets{
      interfaces_interface_subinterfaces_subinterface_state_parent_ae_name=~"ae14.1",
      #interfaces_interface_subinterfaces_subinterface__index=~".+",
      system_id=~"SJC04.+"
    } [$__rate_interval]
  ) * 8
  + on (system_id, interfaces_interface_subinterfaces_subinterface__index)
  group_left (interfaces_interface_subinterfaces_subinterface_state_description)
  last_over_time(
    interfaces__interfaces_interface_subinterfaces_subinterface_state_index{
      interfaces_interface_subinterfaces_subinterface_state_oper_status="UP",
      interfaces_interface__name=~"ae14",
      interfaces_interface_subinterfaces_subinterface_state_name=~"ae14.1",
      #interfaces_interface_subinterfaces_subinterface__index=~".+",
      #interfaces_interface_subinterfaces_subinterface_state_description=~".+",
      system_id=~"SJC04.+"
    } [2m]
  ) * 0
)
by (
  system_id,
  interfaces_interface_subinterfaces_subinterface_state_parent_ae_name,
  interfaces_interface_subinterfaces_subinterface_state_description
)
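The `sum(...) by (...)` wrapper collapses the per-member series into one series per distinct combination of the `by` labels, so member links that share the same parent AE name and description add up to a single bundle rate. Roughly, in Python terms (the short label names `parent_ae` and `member` and the sample values are illustrative assumptions):

```python
from collections import defaultdict

def sum_by(samples, by):
    """Sum sample values per distinct combination of the 'by' labels."""
    totals = defaultdict(float)
    for labels, value in samples:
        key = tuple(labels.get(name, "") for name in by)
        totals[key] += value
    return dict(totals)

# Hypothetical per-member rates in bit/s for two links in the same bundle.
member_rates = [
    ({"system_id": "SJC04-a", "parent_ae": "ae14.1", "member": "xe-0/0/1"}, 400e6),
    ({"system_id": "SJC04-a", "parent_ae": "ae14.1", "member": "xe-0/0/2"}, 600e6),
]

for key, total in sum_by(member_rates, by=("system_id", "parent_ae")).items():
    print(key, total)
```

Labels not listed in `by` (here `member`) disappear from the result, which is why the panel shows one line for the bundle instead of one per member.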
Grafana panel display options - Legend
: () - out-
Related information
https://github.com/Juniper/yang/tree/master/23.4/23.4R1.10/openconfig/models
https://github.com/influxdata/telegraf/blob/master/plugins/inputs/jti_openconfig_telemetry/README.md
https://github.com/influxdata/telegraf/blob/master/plugins/inputs/gnmi/README.md