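On Ubuntu 20.04, how can cAdvisor be used so that Prometheus collects GPU information from Docker containers, along with each GPU's utilization as a share of the host? Please list the code needed for each step in detail, and create a Grafana dashboard JSON file to display the collected data.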
Time: 2024-03-28 09:38:23 Views: 84
Sure. Below are the detailed steps and code, followed by the Grafana dashboard JSON file:
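Step 1: Install cAdvisor and Prometheus
```
# Run cAdvisor. Images are published at gcr.io/cadvisor/cadvisor
# (google/cadvisor on Docker Hub is no longer updated).
# VERSION is only an example: pick a release whose /metrics output
# includes the NVIDIA accelerator metrics (container_accelerator_*).
VERSION=v0.47.2
# For GPU metrics cAdvisor must be able to load NVML (libnvidia-ml.so.1)
# and open the /dev/nvidia* devices. The library path below is an assumption
# for Ubuntu 20.04 driver packages; adjust it to where NVML lives on your host.
sudo docker run \
--volume=/:/rootfs:ro \
--volume=/var/run:/var/run:rw \
--volume=/sys:/sys:ro \
--volume=/var/lib/docker/:/var/lib/docker:ro \
--volume=/usr/lib/x86_64-linux-gnu:/usr/lib/x86_64-linux-gnu:ro \
--env=LD_LIBRARY_PATH=/usr/lib/x86_64-linux-gnu \
--privileged \
--publish=8080:8080 \
--detach=true \
--name=cadvisor \
gcr.io/cadvisor/cadvisor:$VERSION
# Download and unpack Prometheus
wget https://github.com/prometheus/prometheus/releases/download/v2.23.0/prometheus-2.23.0.linux-amd64.tar.gz
tar xvfz prometheus-2.23.0.linux-amd64.tar.gz
cd prometheus-2.23.0.linux-amd64/
```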
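Before wiring up Prometheus, it can help to confirm that cAdvisor is actually exposing GPU metrics. The following is a minimal sketch, assuming cAdvisor listens on `localhost:8080` as in the command above and that your build exports the NVIDIA accelerator metrics (`container_accelerator_*`). The `SAMPLE` text and the container name in it are hypothetical and stand in for a live request.

```python
import re
import urllib.request

# Metric names cAdvisor documents for NVIDIA accelerators.
ACCELERATOR_METRICS = (
    "container_accelerator_duty_cycle",
    "container_accelerator_memory_used_bytes",
    "container_accelerator_memory_total_bytes",
)


def fetch_metrics(url="http://localhost:8080/metrics", timeout=5):
    """Download the raw Prometheus exposition text from cAdvisor."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")


def accelerator_metrics_present(text):
    """Return the accelerator metric names that have at least one sample line."""
    found = set()
    for line in text.splitlines():
        if line.startswith("#"):
            continue  # skip HELP/TYPE comment lines
        match = re.match(r"([a-zA-Z_:][a-zA-Z0-9_:]*)", line)
        if match and match.group(1) in ACCELERATOR_METRICS:
            found.add(match.group(1))
    return sorted(found)


# Hypothetical sample shaped like cAdvisor output (no network access needed).
SAMPLE = '''# TYPE container_accelerator_duty_cycle gauge
container_accelerator_duty_cycle{acc_id="GPU-0",make="nvidia",model="Tesla T4",name="trainer"} 37
container_cpu_usage_seconds_total{name="trainer"} 12.5
'''
print(accelerator_metrics_present(SAMPLE))
```

Against a running instance, `accelerator_metrics_present(fetch_metrics())` returning an empty list would suggest that cAdvisor cannot see the GPUs (for example, NVML is not loadable inside the container).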
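Step 2: Configure prometheus.yml and add cAdvisor as a scrape job
```
# prometheus.yml (edit it with, for example, `vi prometheus.yml`).
# Add the job below under the scrape_configs key; YAML indentation matters.
scrape_configs:
  - job_name: 'cadvisor'
    static_configs:
      - targets: ['localhost:8080']
```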
Step 3: Start Prometheus
```
./prometheus --config.file=prometheus.yml
```
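Once Prometheus is running, its HTTP API can be used to check that the cAdvisor target is being scraped. This is a rough sketch, assuming Prometheus is on its default address `http://localhost:9090`; the `SAMPLE` response is hypothetical and only mirrors the shape of an instant-query result.

```python
import json
import urllib.parse
import urllib.request

PROMETHEUS = "http://localhost:9090"  # default Prometheus address


def build_query_url(expr, base=PROMETHEUS):
    """Build the instant-query URL for a PromQL expression."""
    return base + "/api/v1/query?" + urllib.parse.urlencode({"query": expr})


def run_query(expr, base=PROMETHEUS, timeout=5):
    """Send the query to a running Prometheus and return the decoded JSON."""
    with urllib.request.urlopen(build_query_url(expr, base), timeout=timeout) as resp:
        return json.load(resp)


def vector_to_dict(payload, label="instance"):
    """Map one label of an instant-vector result to its float sample value."""
    if payload.get("status") != "success":
        raise ValueError("query failed: %r" % payload.get("error"))
    data = payload["data"]
    if data["resultType"] != "vector":
        raise ValueError("expected an instant vector, got " + data["resultType"])
    return {s["metric"].get(label, ""): float(s["value"][1]) for s in data["result"]}


# Hypothetical response for the query up{job="cadvisor"}.
SAMPLE = {
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [
            {"metric": {"job": "cadvisor", "instance": "localhost:8080"},
             "value": [1711600703, "1"]}
        ],
    },
}
print(vector_to_dict(SAMPLE))
```

With a live server, `vector_to_dict(run_query('up{job="cadvisor"}'))` should map `localhost:8080` to `1.0` when the scrape succeeds and to `0.0` when it fails.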
Step 4: Add the Prometheus data source in Grafana
- Click "Configuration" -> "Data Sources" -> "Add data source"
- Select "Prometheus" as the type
- Enter "http://localhost:9090" in the URL field (the default Prometheus port)
- Click "Save & Test"
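The same data source can also be registered through Grafana's HTTP API (`POST /api/datasources`) instead of the UI. The sketch below only builds the request; the Grafana address `http://localhost:3000` and the token `EXAMPLE_TOKEN` are placeholders that do not come from this answer, so substitute your own.

```python
import json
import urllib.request


def datasource_request(grafana_url, token, prometheus_url="http://localhost:9090"):
    """Build (but do not send) the POST request that registers Prometheus."""
    body = {
        "name": "Prometheus",
        "type": "prometheus",
        "url": prometheus_url,
        "access": "proxy",  # Grafana's backend, not the browser, talks to Prometheus
        "isDefault": True,
    }
    return urllib.request.Request(
        grafana_url.rstrip("/") + "/api/datasources",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": "Bearer " + token,
        },
        method="POST",
    )


# Placeholder address and token, for illustration only.
req = datasource_request("http://localhost:3000", "EXAMPLE_TOKEN")
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

To actually create the data source, pass the request to `urllib.request.urlopen(req, timeout=5)` with a token that has permission to manage data sources.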
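Step 5: Create the Grafana dashboard
- Click "Create" -> "Dashboard"
- Choose "Add Query" -> "Query" -> "Prometheus"
- Enter the following in the query field to get GPU utilization per container. cAdvisor exposes NVIDIA GPU utilization as container_accelerator_duty_cycle, a gauge in percent, so no rate() is needed:
```
container_accelerator_duty_cycle{name=~".+"}
```
- Enter the following in the query field to get GPU utilization on the host. cAdvisor has no host-level GPU metric, so this approximation takes the highest duty cycle reported for each GPU (acc_id) and averages across GPUs; it only covers GPUs attached to at least one container:
```
avg(max by (acc_id) (container_accelerator_duty_cycle))
```
- Select "Graph" in the upper-right corner and add the query results to the dashboard
- Click "Save Dashboard" and name it "GPU Usage"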
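The question also asks for each container's share of the host's GPU resources. One way to work that out from cAdvisor's memory metrics is sketched below. It assumes that samples from `container_accelerator_memory_used_bytes` and `container_accelerator_memory_total_bytes` have already been collected into plain dictionaries, and that each GPU is assigned to a single container (if containers share a GPU, per-device figures may be counted more than once). The container names, GPU IDs, and sizes are hypothetical.

```python
def gpu_memory_share(used_bytes, total_bytes):
    """Return {container: fraction of total host GPU memory it holds}.

    used_bytes:  {(container, acc_id): bytes in use}
    total_bytes: {acc_id: capacity in bytes}
    """
    host_total = sum(total_bytes.values())
    if host_total <= 0:
        raise ValueError("no GPU memory capacity reported")
    per_container = {}
    for (container, _acc_id), used in used_bytes.items():
        per_container[container] = per_container.get(container, 0) + used
    return {name: used / host_total for name, used in per_container.items()}


GIB = 1024 ** 3
# Hypothetical samples: two 16 GiB GPUs, one dedicated to each container.
used = {
    ("trainer", "GPU-0"): 12 * GIB,
    ("inference", "GPU-1"): 4 * GIB,
}
total = {"GPU-0": 16 * GIB, "GPU-1": 16 * GIB}
for name, share in sorted(gpu_memory_share(used, total).items()):
    print(f"{name}: {share:.1%}")
```

Here the host has 32 GiB of GPU memory in total, so `trainer` holds 12/32 = 37.5% and `inference` holds 4/32 = 12.5%.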
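Finally, here is the Grafana dashboard JSON file, in the payload format accepted by Grafana's dashboard HTTP API (POST /api/dashboards/db):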
```
{
  "dashboard": {
    "id": null,
    "title": "GPU Usage",
    "timezone": "browser",
    "panels": [
      {
        "id": 1,
        "title": "GPU Usage",
        "type": "graph",
        "gridPos": { "h": 8, "w": 12, "x": 0, "y": 0 },
        "legend": { "show": true },
        "targets": [
          {
            "expr": "container_accelerator_duty_cycle{name=~\".+\"}",
            "format": "time_series",
            "interval": "",
            "intervalFactor": 1,
            "refId": "A"
          }
        ],
        "xaxis": {
          "buckets": null,
          "mode": "time",
          "name": null,
          "show": true,
          "values": []
        },
        "yaxes": [
          { "format": "percent", "label": null, "logBase": 1, "max": null, "min": null, "show": true },
          { "format": "short", "label": null, "logBase": 1, "max": null, "min": null, "show": true }
        ],
        "yaxis": { "align": false, "alignLevel": null }
      },
      {
        "id": 2,
        "title": "GPU Usage in Host",
        "type": "graph",
        "gridPos": { "h": 8, "w": 12, "x": 0, "y": 8 },
        "legend": { "show": true },
        "targets": [
          {
            "expr": "avg(max by (acc_id) (container_accelerator_duty_cycle))",
            "format": "time_series",
            "interval": "",
            "intervalFactor": 1,
            "refId": "A"
          }
        ],
        "xaxis": {
          "buckets": null,
          "mode": "time",
          "name": null,
          "show": true,
          "values": []
        },
        "yaxes": [
          { "format": "percent", "label": null, "logBase": 1, "max": null, "min": null, "show": true },
          { "format": "short", "label": null, "logBase": 1, "max": null, "min": null, "show": true }
        ],
        "yaxis": { "align": false, "alignLevel": null }
      }
    ],
    "schemaVersion": 22,
    "version": 0
  },
  "folderId": null,
  "overwrite": false
}
```
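Before importing a dashboard file, a quick script can confirm that it parses as JSON and show which PromQL expression each panel runs. The sketch below is one way to do that; it accepts both the API wrapper with a top-level `"dashboard"` key and a bare dashboard object, and it looks in the top-level `"panels"` list as well as the legacy `"rows"` layout. The `SAMPLE` document is a trimmed, hypothetical stand-in for the file above.

```python
import json


def list_panel_queries(doc):
    """Return (panel title, PromQL expr) pairs from a dashboard document."""
    dashboard = doc.get("dashboard", doc)  # unwrap the API payload if present
    panels = list(dashboard.get("panels", []))
    for row in dashboard.get("rows", []):  # legacy row-based layout
        panels.extend(row.get("panels", []))
    return [
        (panel.get("title", ""), target.get("expr", ""))
        for panel in panels
        for target in panel.get("targets", [])
    ]


# Trimmed, hypothetical stand-in for the dashboard file.
SAMPLE = json.loads("""
{"dashboard": {"title": "GPU Usage",
               "panels": [{"title": "GPU Usage",
                           "targets": [{"expr": "up", "refId": "A"}]}]},
 "overwrite": false}
""")
for title, expr in list_panel_queries(SAMPLE):
    print(f"{title}: {expr}")
```

To check a real file, load it with `json.load(open("gpu-dashboard.json"))`, where the filename is whatever you saved the JSON as, and pass the result to `list_panel_queries`.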