
RabbitMQ Prometheus alerting - getting alerts even when there is no data in RabbitMQ - Stack Overflow


I am using Prometheus alerting for RabbitMQ. Below is the configuration I am using.

prometheus.yml

global:
  scrape_interval: 15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 5m # Evaluate rules every 5 minutes. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).
alerting:
   alertmanagers:
       - static_configs:
           - targets:
               - ip:port
rule_files:
- "alerts_rules.yml"
scrape_configs:
- job_name: "prometheus"
  static_configs:
  - targets: ["ip:port"]

alerts_rules.yml

groups:
- name: instance_alerts
  rules:
  - alert: "Instance Down"
    expr: up == 0
    for: 30s
    # keep_firing_for: 30s
    labels:
      severity: "Critical"
    annotations:
      summary: "Endpoint {{ $labels.instance }} down"
      description: "{{ $labels.instance }} of job {{ $labels.job }} has been down for more than 30 sec."

- name: rabbitmq_alerts
  rules:
    - alert: "Consumer down for last 1 min"
      expr: rabbitmq_queue_consumers == 0
      for: 30s
      # keep_firing_for: 30s
      labels:
        severity: Critical
      annotations:
        summary: "shortify | '{{ $labels.queue }}' has no consumers"
        description: "The queue '{{ $labels.queue }}' in vhost '{{ $labels.vhost }}' has zero consumers for more than 30 sec. Immediate attention is required."


    - alert: "Total Messages > 10k in last 1 min"
      expr: rabbitmq_queue_messages > 10000
      for: 30s
      # keep_firing_for: 30s
      labels:
        severity: Critical
      annotations:
        summary: "'{{ $labels.queue }}' has total '{{ $value }}' messages for more than 1 min."
        description: |
          Queue {{ $labels.queue }} in RabbitMQ has total {{ $value }} messages for more than 1 min.
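
The message-count rule above can be checked offline with a promtool rule unit test. The following is a minimal sketch, not part of the original setup: the file name alerts_rules_test.yml, the queue name "demo", and the vhost "/" are assumptions chosen for illustration. It feeds an empty queue (zero messages) into the rule and expects no alert.

```yaml
# alerts_rules_test.yml (hypothetical file name)
# Run with: promtool test rules alerts_rules_test.yml
rule_files:
  - alerts_rules.yml

# Evaluate the rules every 15s inside the test, so that "for: 30s" spans
# more than one evaluation.
evaluation_interval: 15s

tests:
  - interval: 15s
    input_series:
      # Hypothetical queue "demo" in vhost "/" that stays empty for the whole test.
      - series: 'rabbitmq_queue_messages{queue="demo", vhost="/"}'
        values: '0 0 0 0 0'
    alert_rule_test:
      # After one minute of an empty queue, the message-count alert
      # should not be pending or firing.
      - eval_time: 1m
        alertname: "Total Messages > 10k in last 1 min"
        exp_alerts: []
```

If this test passes while alerts still show up in the running server, the rule expression itself is probably not the cause, and the evaluation timing or the scraped data is worth a closer look.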

Even when the queue has no data, Prometheus still sends me alerts. I have set evaluation_interval: 5m (Prometheus evaluates alert rules every 5 minutes) and for: 30s (the alert should fire only if the condition persists for 30s).
I suspect for: 30s is not working for me.
By the way, I am not using Alertmanager; I am only using Prometheus.

How can I solve this? Thank you in advance.
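
One possible reading of this behaviour, offered as a sketch rather than a confirmed diagnosis: Prometheus only checks a rule's condition, including its for: duration, when the rule group is evaluated. With evaluation_interval: 5m, an alert that becomes pending at one evaluation can only start firing at the next one, five minutes later, and it stays in the firing state until the first evaluation after the condition clears. So an alert can still be shown for up to five minutes after the queue has drained. Separately, rabbitmq_queue_consumers == 0 is true for any queue without consumers, whether or not it holds messages. The fragment below assumes that evaluating more often is acceptable for this setup; the 15s value is an illustrative choice, not a requirement.

```yaml
global:
  scrape_interval: 15s
  # Assumption: evaluate rules as often as targets are scraped, so that
  # "for: 30s" is checked across several evaluations instead of one every 5m.
  evaluation_interval: 15s
```

With this setting, a condition has to hold at consecutive evaluations spanning at least 30 seconds before the alert fires, and the alert resolves within about one evaluation interval once the condition is no longer true.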
