How to monitor and alert on Pod resource usage with Prometheus
Date: 2023-09-12 15:12:31
Prometheus alerting rule configuration, covering hosts, pods, Kubernetes components, and more
Prometheus can monitor Pod resource usage and raise alerts through alerting rules. The steps are as follows:
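1. Install Prometheus Operator
The stable/prometheus-operator chart has been deprecated; its successor is kube-prometheus-stack in the prometheus-community repository:
```
kubectl create namespace monitoring
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update
helm install prometheus-operator prometheus-community/kube-prometheus-stack --namespace monitoring
```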
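2. Install Node Exporter
Node Exporter reports host-level metrics; the per-container CPU metrics used in step 4 come from cAdvisor, which is built into the kubelet. Helm charts such as kube-prometheus-stack already deploy Node Exporter, in which case this step can be skipped. Check that the manifest URL below still resolves before applying it.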
```
kubectl apply -f https://raw.githubusercontent.com/prometheus/node_exporter/master/examples/k8s/node-exporter.yaml
```
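3. Configure Prometheus
Add the following scrape job to the Prometheus configuration file. (When Prometheus is managed by the Operator, scrape targets are normally declared through ServiceMonitor or PodMonitor resources instead; this snippet assumes a hand-edited prometheus.yml.)
```
scrape_configs:
  - job_name: kubernetes-pods
    kubernetes_sd_configs:
      - role: pod
    relabel_configs:
      # Keep every discovered target (the '.*' pattern matches everything)
      - source_labels: [__meta_kubernetes_pod_container_name]
        action: keep
        regex: '.*'
      # Copy pod metadata whose label name matches the regex onto the target
      - action: labelmap
        regex: __meta_kubernetes_pod_(.+)
      - source_labels: [__meta_kubernetes_namespace]
        action: replace
        target_label: kubernetes_namespace
      - source_labels: [__meta_kubernetes_pod_name]
        action: replace
        target_label: kubernetes_pod_name
      # Rewrite the scrape address to port 9100 (the Node Exporter default)
      - source_labels: [__address__, __meta_kubernetes_pod_container_port_name]
        action: replace
        target_label: __address__
        regex: (.+):(?:\d+);(?:.+)
        replacement: $1:9100
      - source_labels: [__address__]
        action: replace
        target_label: instance
```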
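To see what the address-rewriting relabel rule does, the sketch below imitates it in Python. It assumes Prometheus's documented behaviour of joining the source label values with ';' and anchoring the regex at both ends. The function name and the sample addresses are made up for illustration and are not part of Prometheus.

```python
import re


def relabel_replace(values, regex, replacement, current):
    """Imitate a Prometheus 'replace' relabel action for one target.

    values:  source label values in order (Prometheus joins them with ';').
    current: existing value of the target label, kept when the regex misses.
    """
    joined = ";".join(values)
    # Relabel regexes are fully anchored, so use fullmatch rather than search.
    match = re.fullmatch(regex, joined)
    if match is None:
        return current
    # Translate "$1"-style group references into Python's "\g<1>" form.
    template = re.sub(r"\$(\d+)", r"\\g<\1>", replacement)
    return match.expand(template)


ADDRESS_REGEX = r"(.+):(?:\d+);(?:.+)"

# A container port named "metrics": the address is rewritten to port 9100.
print(relabel_replace(["10.0.0.5:8080", "metrics"], ADDRESS_REGEX, "$1:9100", "10.0.0.5:8080"))
# An unnamed port: the trailing (?:.+) cannot match, so the address is unchanged.
print(relabel_replace(["10.0.0.5:8080", ""], ADDRESS_REGEX, "$1:9100", "10.0.0.5:8080"))
```

The second call shows a side effect of the pattern that is easy to miss: targets whose container port has no name keep their original address.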
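4. Configure alerting rules
Save the following alerting rule in a rule file that Prometheus loads (see step 5). The cAdvisor labels container_name and pod_name were renamed to container and pod in Kubernetes 1.16, and container_spec_cpu_quota must be divided by container_spec_cpu_period to obtain the limit in cores. Pods without a CPU limit have no quota and are not covered by this rule.
```
groups:
  - name: example
    rules:
      - alert: HighPodUsage
        # CPU cores used, divided by the CPU limit in cores (quota / period)
        expr: |
          sum by (namespace, pod) (rate(container_cpu_usage_seconds_total{container!="", container!="POD"}[5m]))
            /
          sum by (namespace, pod) (container_spec_cpu_quota{container!="", container!="POD"} / container_spec_cpu_period{container!="", container!="POD"})
            > 0.8
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "High pod CPU usage ({{ $labels.namespace }}/{{ $labels.pod }})"
          description: "{{ $labels.namespace }}/{{ $labels.pod }} is using more than 80% of the CPU allocated to it."
```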
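As a worked example of the arithmetic behind this kind of rule: rate() turns the CPU-seconds counter into cores used, and the CFS quota divided by the CFS period gives the limit in cores. The Python sketch below uses made-up sample values and a simplified two-sample rate (the real rate() also handles counter resets and extrapolation); all function names are hypothetical.

```python
def cpu_cores_used(first_sample, last_sample):
    """Average cores used between two (timestamp_s, cpu_seconds_total) samples."""
    (t0, v0), (t1, v1) = first_sample, last_sample
    return (v1 - v0) / (t1 - t0)


def cpu_limit_cores(quota_us, period_us):
    """CPU limit in cores: CFS quota divided by CFS period (same time unit)."""
    return quota_us / period_us


def usage_ratio(first_sample, last_sample, quota_us, period_us):
    """Fraction of the CPU limit consumed over the sample window."""
    used = cpu_cores_used(first_sample, last_sample)
    return used / cpu_limit_cores(quota_us, period_us)


# A container with a 500m limit (quota 50000 over period 100000)
# that consumed 135 CPU-seconds during a 300-second window:
ratio = usage_ratio((0, 1000.0), (300, 1135.0), quota_us=50_000, period_us=100_000)
print(f"{ratio:.2f}", ratio > 0.8)  # prints "0.90 True"
```

Dividing usage by the raw quota alone would be off by a factor equal to the period, which is why the quota is normalised first.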
5. Enable the alerting rules in Prometheus
Add the following to the Prometheus configuration file:
```
rule_files:
- /etc/prometheus/rules/*.rules
```
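6. Set up the alert receiver
Alertmanager delivers alerts to receivers. Add the following to the Alertmanager configuration file; the root route must name a default receiver:
```
route:
  receiver: 'slack-notifications'
receivers:
  - name: 'slack-notifications'
    slack_configs:
      - api_url: '<your_slack_webhook_url>'
        channel: '#alerts'
        send_resolved: true
```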
7. Connect Prometheus to Alertmanager
Add the following to the Prometheus configuration file:
```
alerting:
  alertmanagers:
    - static_configs:
        - targets: ['alertmanager:9093']
```
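One way to check the Alertmanager side without waiting for a real alert is to post a test alert to its v2 HTTP API (POST /api/v2/alerts). The following Python sketch only assumes the alertmanager:9093 address from the snippet above; the helper names, the sample namespace and pod, and the label values are hypothetical.

```python
import json
import urllib.request
from datetime import datetime, timezone


def build_test_alert(namespace, pod):
    """Return a one-element alert list in the shape the v2 alerts endpoint accepts."""
    return [{
        "labels": {
            "alertname": "HighPodUsage",
            "severity": "critical",
            "namespace": namespace,
            "pod": pod,
        },
        "annotations": {"summary": f"Test alert for {namespace}/{pod}"},
        "startsAt": datetime.now(timezone.utc).isoformat(),
    }]


def make_request(base_url, alerts):
    """Build (but do not send) the POST request for /api/v2/alerts."""
    return urllib.request.Request(
        url=f"{base_url}/api/v2/alerts",
        data=json.dumps(alerts).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    # Sends the alert; requires network access to the Alertmanager service.
    request = make_request("http://alertmanager:9093", build_test_alert("default", "demo-pod"))
    with urllib.request.urlopen(request, timeout=5) as response:
        print(response.status)
```

If the receiver is wired up correctly, the test alert should show up in the configured Slack channel.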
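With these steps in place, Prometheus monitors Pod CPU usage and evaluates the alerting rule. If a Pod's CPU usage stays above 80% of its allocation for 5 minutes, an alert fires and is delivered to the Slack receiver.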
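Note that the rule's for: 5m clause means the alert does not fire on the first evaluation above 80%: it first becomes pending and fires only once the expression has held for the full duration. A simplified Python model of that behaviour follows; the 60-second evaluation interval, the sample ratios, and the function name are assumptions for illustration.

```python
def alert_states(ratios, threshold=0.8, for_seconds=300, interval_seconds=60):
    """Return 'inactive', 'pending' or 'firing' for each rule evaluation."""
    states = []
    active_since = None  # evaluation time at which the condition became true
    for i, ratio in enumerate(ratios):
        now = i * interval_seconds
        if ratio > threshold:
            if active_since is None:
                active_since = now
            held = now - active_since
            states.append("firing" if held >= for_seconds else "pending")
        else:
            active_since = None  # any false evaluation resets the timer
            states.append("inactive")
    return states


# A dip below the threshold at the third evaluation restarts the 5-minute wait.
print(alert_states([0.9, 0.9, 0.7] + [0.9] * 6))
```

In this model, a short dip below the threshold resets the timer, so brief spikes never reach the firing state.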