How do I monitor and alert on Pod resource usage with Prometheus?
Posted: 2023-09-10 09:14:15
You can monitor Pod resource usage with Prometheus and alert on it by following these steps:
Step 1: Install Prometheus in your Kubernetes cluster, for example with a Helm chart.
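If you install with the community `prometheus-community/prometheus` chart, a small values file is often enough to get started. The sketch below is an assumption about that chart's layout, not something stated in the answer: check the key names (`server.retention`, `alertmanager.enabled`, `kube-state-metrics.enabled`) against the `values.yaml` of the chart version you actually use.

```yaml
# values.yaml (hypothetical file name), assuming the prometheus-community/prometheus chart.
server:
  retention: 15d          # how long the server keeps samples
alertmanager:
  enabled: true           # deploy Alertmanager so alerts can be routed
kube-state-metrics:
  enabled: true           # exposes kube_* object metrics such as resource limits
```

The file would then be passed to `helm install` with the `-f` flag.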
Step 2: Scrape the kubelet so that Prometheus collects container metrics. Add the following under `scrape_configs` in the Prometheus configuration file:
```
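# Discover every node and scrape the kubelet's cAdvisor endpoint, which
# exposes the container_* metrics used by the alert rule in step 4.
# Assumes Prometheus runs in-cluster with a service account allowed to read node metrics.
- job_name: kubelet
  scheme: https
  metrics_path: /metrics/cadvisor
  tls_config:
    ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
  bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
  kubernetes_sd_configs:
    - role: node
  relabel_configs:
    - source_labels: [__meta_kubernetes_node_label_kubernetes_io_hostname]
      target_label: kubernetes_io_hostname
    - action: labelmap
      regex: __meta_kubernetes_node_label_(.+)
    - action: replace
      source_labels: [__meta_kubernetes_node_name]
      target_label: instance
# Optional static alternative. Job names must be unique, and port 10255 is the
# kubelet read-only port, which many clusters disable.
- job_name: 'kubelet-static'
  static_configs:
    - targets: ['<kubelet_ip>:10255']
      labels:
        node_exporter_port: '10255'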
```
Step 3: Define a scrape job that collects metrics from Pods. Add the following under `scrape_configs` in the Prometheus configuration file:
```
- job_name: 'kubernetes-pods'
  scrape_interval: 30s
  metrics_path: /metrics
  kubernetes_sd_configs:
    - role: pod
  relabel_configs:
    # Keep only Pods annotated with prometheus.io/scrape: "true".
    - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
      action: keep
      regex: true
    # Rewrite the scrape address to use the port from prometheus.io/port.
    - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
      action: replace
      regex: ([^:]+)(?::\d+)?;(\d+)
      replacement: $1:$2
      target_label: __address__
    # Copy Pod metadata onto the scraped series.
    - source_labels: [__meta_kubernetes_namespace]
      target_label: kubernetes_namespace
    - source_labels: [__meta_kubernetes_pod_name]
      target_label: kubernetes_pod_name
    - source_labels: [__meta_kubernetes_pod_container_name]
      target_label: container_name
    - source_labels: [__meta_kubernetes_pod_container_port_name]
      target_label: container_port_name
    - source_labels: [__meta_kubernetes_pod_container_port_number]
      target_label: container_port_number
    - source_labels: [__meta_kubernetes_pod_label_app]
      target_label: app
```
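The `kubernetes-pods` job is meant to scrape only Pods that opt in through the `prometheus.io/scrape` annotation, and to read the metrics port from `prometheus.io/port`. As a minimal sketch, a Pod that opts in could look like this; the Pod name, image, and port are hypothetical placeholders.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: demo-app                    # hypothetical name
  namespace: default
  labels:
    app: demo-app                   # surfaced as the `app` label by the scrape job
  annotations:
    prometheus.io/scrape: "true"    # quoted: annotation values must be strings
    prometheus.io/port: "8080"      # port that serves /metrics
spec:
  containers:
    - name: demo-app
      image: example.com/demo-app:1.0   # hypothetical image
      ports:
        - name: http-metrics
          containerPort: 8080
```

Pods without the `prometheus.io/scrape` annotation are not scraped by this job, although their CPU and memory usage is still reported through the kubelet metrics.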
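Step 4: Configure the Prometheus alerting rules. Save the following in a rule file and list that file under `rule_files` in the Prometheus configuration file: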
```
groups:
  - name: pod-usage-rules
    rules:
      - alert: HighPodCPUUsage
        # Fires when the container averages more than 1 CPU core over 5 minutes.
        # Replace the <...> placeholders with real values, or drop the matchers to cover all Pods.
        expr: sum(rate(container_cpu_usage_seconds_total{namespace="<namespace>", pod="<pod_name>", container="<container_name>"}[5m])) by (namespace, pod, container) > 1
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "High CPU usage on pod"
          description: "{{ $labels.namespace }}/{{ $labels.pod }}/{{ $labels.container }} is using too much CPU."
```
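CPU is only one dimension of Pod usage. As a sketch under stated assumptions, a memory rule could sit in the same rule file as a separate group. It assumes the kubelet cAdvisor metrics `container_memory_working_set_bytes` and `container_spec_memory_limit_bytes` are being scraped; the group name, alert name, and 90% threshold are illustrative choices, not part of the original answer.

```yaml
groups:
  - name: pod-memory-rules            # hypothetical group name
    rules:
      - alert: HighPodMemoryUsage     # hypothetical alert name
        # Working set as a fraction of the memory limit. The `> 0` filter drops
        # containers without a limit, which report a limit of 0.
        expr: |
          container_memory_working_set_bytes{container!=""}
            / (container_spec_memory_limit_bytes{container!=""} > 0)
            > 0.9
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "High memory usage on pod"
          description: "{{ $labels.namespace }}/{{ $labels.pod }}/{{ $labels.container }} is above 90% of its memory limit."
```

Expressing memory as a ratio of the limit keeps a single threshold meaningful across containers of different sizes, which an absolute byte threshold would not.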
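Step 5: Reload the Prometheus configuration to apply the changes, either by sending SIGHUP to the Prometheus process or by sending an HTTP POST to `/-/reload` (which requires the `--web.enable-lifecycle` flag).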
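Prometheus can reload its configuration without a restart through an HTTP POST to `/-/reload`, but only when it was started with `--web.enable-lifecycle`. If you run Prometheus from your own Deployment rather than a chart that manages this for you, the container spec might include the flag as sketched below; the image tag and config path are assumptions.

```yaml
# Fragment of a Deployment's Pod template (spec.template.spec).
containers:
  - name: prometheus
    image: prom/prometheus:v2.47.0        # assumed version; pin the one you run
    args:
      - --config.file=/etc/prometheus/prometheus.yml
      - --web.enable-lifecycle            # enables POST /-/reload
    ports:
      - containerPort: 9090
```

Without the flag, sending SIGHUP to the process or restarting the Pod remains the way to pick up changes.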
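Those are the steps for monitoring Pod resource usage with Prometheus and setting up alerts. Adjust the placeholders, thresholds, and labels to fit your environment.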