Configuring a Prometheus alert for container CPU usage reaching 90%

Date: 2024-10-22 14:13:38

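Prometheus is an open-source monitoring system that evaluates alerting rules against the metrics it scrapes from targets such as containers. To alert when a container's CPU usage reaches 90%, define an alerting rule in a separate rules file (called `alert.rules.yml` here) and list that file under `rule_files` in the main configuration file `prometheus.yml`. Rule groups do not belong under `alerting` or `scrape_configs`: `scrape_configs` defines the targets to scrape, and `alerting` tells Prometheus which Alertmanager instances receive the fired alerts.

Add the following rule group to `alert.rules.yml`:

```yaml
groups:
  - name: cpu_alerts
    rules:
      - alert: HighCPUUsage
        # Assumes cAdvisor metrics. The metric is a counter of CPU seconds,
        # so rate() * 100 gives usage as a percentage of one core.
        expr: sum by (instance, name) (rate(container_cpu_usage_seconds_total{job="agent1", name!=""}[5m])) * 100 > 90
        for: 5m
        labels:
          severity: "page"
        annotations:
          summary: "Container {{ $labels.name }} on {{ $labels.instance }} has high CPU usage."
          description: "Container {{ $labels.name }} on {{ $labels.instance }} has been using more than 90% CPU for the last 5 minutes."
```

Then reference the rules file in `prometheus.yml` and confirm that the scrape job matches the `job` label used in the rule:

```yaml
rule_files:
  - "/usr/local/prometheus/alert.rules.yml"

scrape_configs:
  - job_name: 'agent1'  # must match job="agent1" in the rule expression
    static_configs:
      - targets: ['10.1.1.14:9100']  # replace with the IP and port that expose the container metrics
  # other configuration...
```

The `expr` line is the key part. With cAdvisor, container CPU is exposed as the counter `container_cpu_usage_seconds_total` rather than as a ready-made percentage, which is why the expression applies `rate()` before comparing against 90. Replace the metric and label names with the ones your exporter actually provides (on Kubernetes, for example, the container is usually identified by `container` and `pod` labels instead of `name`). Note also that port 9100 is the default for node_exporter, which reports host metrics; cAdvisor listens on 8080 by default. `for: 5m` means the alert fires only after CPU usage has stayed above 90% continuously for 5 minutes; until then it is shown as pending.

After saving the files, restart Prometheus to apply the new configuration:

1. Run `pkill prometheus` to stop the current process.
2. Make sure the Prometheus port (9090 by default) is free. Skip this step if no check is needed.
3. Start Prometheus again in the background: `/usr/local/prometheus/prometheus --config.file="/usr/local/prometheus/prometheus.yml" &`
4. If everything is working, check the alert state in the Prometheus UI (default port 9090), or query the HTTP API to verify that the rule has been loaded.

As an alternative to a full restart, sending SIGHUP (`pkill -HUP prometheus`) makes Prometheus reload its configuration and rule files.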
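Container CPU metrics from exporters such as cAdvisor are typically cumulative counters of CPU seconds (`container_cpu_usage_seconds_total`), so a threshold like 90% has to be compared against a per-second rate rather than the raw value. The following is a minimal sketch of that arithmetic in Python. The function name and the sample values are hypothetical, and the calculation is a simplification: PromQL's `rate()` also extrapolates to the window boundaries, which is not modelled here.

```python
def cpu_percent(prev_seconds, curr_seconds, interval_seconds):
    """Average CPU usage between two counter samples, as a percentage of one core."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    delta = curr_seconds - prev_seconds
    if delta < 0:
        # The counter went backwards, e.g. after a container restart.
        # Treat the current value as the increase since the reset.
        delta = curr_seconds
    return delta * 100.0 / interval_seconds


# Hypothetical counter samples taken 300 s (5 minutes) apart.
usage = cpu_percent(prev_seconds=1200.0, curr_seconds=1485.0, interval_seconds=300.0)
should_alert = usage > 90  # same threshold as the alert rule
print(f"CPU usage: {usage:.1f}% of one core, above threshold: {should_alert}")
```

Here the container consumed 285 CPU-seconds in a 300-second window, which is 95% of one core, so the 90% condition holds. A container limited to less or more than one core would need the result divided by its CPU limit to express usage relative to that limit.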

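The last step mentions verifying the rule through the API. One way to do that is the Prometheus HTTP endpoint `/api/v1/rules`, which lists the loaded rule groups along with each alerting rule's state (`inactive`, `pending`, or `firing`). The sketch below is illustrative only: the base URL `http://localhost:9090`, the helper names, and the sample response are assumptions, and the sample is trimmed to the fields the code reads.

```python
import json
import urllib.request


def find_alert_rule(rules_response, alert_name):
    """Return the alerting rule named alert_name from an /api/v1/rules response, or None."""
    for group in rules_response.get("data", {}).get("groups", []):
        for rule in group.get("rules", []):
            if rule.get("type") == "alerting" and rule.get("name") == alert_name:
                return rule
    return None


def fetch_rules(base_url="http://localhost:9090"):
    """Fetch the loaded rules from a running Prometheus (base_url is an assumption)."""
    with urllib.request.urlopen(f"{base_url}/api/v1/rules", timeout=5) as resp:
        return json.load(resp)


# Hypothetical, trimmed response used so the example runs without a server.
sample = {
    "status": "success",
    "data": {
        "groups": [
            {
                "name": "cpu_alerts",
                "rules": [
                    {
                        "name": "HighCPUUsage",
                        "type": "alerting",
                        "state": "inactive",
                        "health": "ok",
                    }
                ],
            }
        ]
    },
}

rule = find_alert_rule(sample, "HighCPUUsage")
print(rule["state"] if rule else "rule not loaded")
```

Against a live server, `find_alert_rule(fetch_rules(), "HighCPUUsage")` would perform the same check. A result of `None` suggests the rules file was not picked up, for example because it is missing from `rule_files`.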