### Configure Prometheus monitoring and notifications

https://github.com/prometheus/prometheus
Edit the configuration so that Prometheus scrapes the ports of the services you want to monitor, such as caddy or nacos:
./prometheus.yml
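A minimal sketch of what the scrape section of prometheus.yml might look like. The caddy target and job name match the labels used in the alert rule in this document. The nacos address, its port 8848, the metrics path, and the scrape interval are assumptions for illustration and should be adjusted to your deployment.

```yaml
# prometheus.yml (sketch, not a definitive configuration)
global:
  scrape_interval: 15s            # assumed interval

scrape_configs:
  - job_name: "caddy"
    # Caddy serves Prometheus metrics at /metrics on its admin endpoint
    static_configs:
      - targets: ["172.17.0.1:2019"]

  - job_name: "nacos"
    # Assumed path; requires the actuator prometheus endpoint to be exposed in nacos
    metrics_path: "/nacos/actuator/prometheus"
    static_configs:
      - targets: ["172.17.0.2:8848"]   # hypothetical address
```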
  • Integrate Alertmanager for alert tracking (docs)

# Download, install, and run
https://github.com/prometheus/alertmanager
# Edit the Prometheus configuration file prometheus.yml and point alerting at Alertmanager on port 9093
alerting:
  alertmanagers:
    - static_configs:
        - targets:
          - 172.17.0.3:9093
  • Integrate PrometheusAlert for notifications

# Download and install, then configure the DingTalk settings and any other notification channels
https://github.com/feiyu563/PrometheusAlert
# Edit the Alertmanager configuration file alertmanager.yml
global:
  resolve_timeout: 5m
route:
  group_by: ['instance']
  group_wait: 10m
  group_interval: 10s
  repeat_interval: 10m
  receiver: 'web.hook.prometheusalert'
receivers:
- name: 'web.hook.prometheusalert'
  webhook_configs:
  - url: 'http://[prometheusalert_url]:8080/prometheusalert?type=dd&tpl=prometheus-dd&ddurl=https://oapi.dingtalk.com/robot/send?access_token=xxxxxxxxxxxxxxxxxxxxxx&at=18888888888'
  • Create a new file caddy_error.yml and reference it in the rule_files parameter so that the rule can trigger alerts

# In prometheus.yml:
rule_files:
  - "caddy_error.yml"
# In caddy_error.yml:
groups:
- name: caddy_error
  rules:
  - alert: HighErrorRate
    expr: caddy_http_requests_total{handler="file_server", instance="172.17.0.1:2019", job="caddy", server="srv0"} > 10
    for: 10m
    labels:
      severity: page
    annotations:
      summary: High request count
      description: description info
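Because caddy_http_requests_total is a counter, comparing its raw value to 10 stays true once the total passes 10 and only resets when Caddy restarts. A rate-based variant may be closer to what is intended. The sketch below is one possible alternative, not the original rule. The group name, alert name, 5m window, and threshold of 10 requests per second are illustrative assumptions.

```yaml
# caddy_error.yml (alternative sketch using rate() over the counter)
groups:
- name: caddy_rate                # hypothetical group name
  rules:
  - alert: HighRequestRate        # hypothetical alert name
    # Per-second request rate averaged over the last 5 minutes
    expr: rate(caddy_http_requests_total{job="caddy", handler="file_server"}[5m]) > 10
    for: 10m
    labels:
      severity: page
    annotations:
      summary: "High request rate on {{ $labels.instance }}"
      description: "Current rate: {{ $value }} requests/s"
```

The `{{ $labels.instance }}` and `{{ $value }}` templates are filled in by Prometheus when the alert fires, so the notification forwarded through Alertmanager carries the affected instance and the measured value.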