Video Object Detection with YOLO in Python
Date: 2023-11-05 08:53:16
YOLO (You Only Look Once) is a real-time object detection algorithm that can be applied to video. Below is a simple Python example of YOLO-based video object detection:
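First, install the required libraries. The script below imports only OpenCV and NumPy; OpenCV's `dnn` module reads Darknet-format weights and config files directly, so a separate darknet package is not needed. In a notebook, run the following commands (in a regular shell, drop the leading `!`):
```
!pip install opencv-python
!pip install numpy
```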
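Before moving on, it can help to confirm that the packages are actually importable. The following is a minimal sketch, not part of the original walkthrough: `missing_modules` is a hypothetical helper name, and it checks for `cv2` and `numpy` because those are the two modules the detection script imports.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names in `names` that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # cv2 is provided by opencv-python; numpy handles the array math.
    missing = missing_modules(["cv2", "numpy"])
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules are importable.")
```

Using `find_spec` rather than a plain `import` keeps the check cheap, since it locates each module without loading it.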
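Next, download the pretrained YOLO weights, the configuration file, and the class names. They are available from the official darknet website, or with the following commands: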
```
!wget https://pjreddie.com/media/files/yolov3.weights
!wget https://raw.githubusercontent.com/pjreddie/darknet/master/cfg/yolov3.cfg
!wget https://raw.githubusercontent.com/pjreddie/darknet/master/data/coco.names
```
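Large downloads can be interrupted and leave missing or empty files, so a quick check before loading the model may save some debugging. This is a sketch under stated assumptions, not part of the original: `check_model_files` is a hypothetical helper, and it only verifies that each file exists and is non-empty, not that its contents are valid.

```python
import os


def check_model_files(paths):
    """Return a dict mapping each path to 'ok', 'missing', or 'empty'."""
    status = {}
    for path in paths:
        if not os.path.isfile(path):
            status[path] = "missing"
        elif os.path.getsize(path) == 0:
            status[path] = "empty"
        else:
            status[path] = "ok"
    return status


if __name__ == "__main__":
    # File names match the three downloads above.
    files = ["yolov3.weights", "yolov3.cfg", "coco.names"]
    for path, state in check_model_files(files).items():
        print(f"{path}: {state}")
```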
The following code then runs object detection on a video:
```python
import cv2
import numpy as np

# Load the YOLOv3 network from the weights and config files
net = cv2.dnn.readNet("yolov3.weights", "yolov3.cfg")

# Load the class names, one per line
with open("coco.names", "r") as f:
    classes = [line.strip() for line in f]

# Network input size (width, height)
input_size = (416, 416)

# Open the video file
cap = cv2.VideoCapture("video.mp4")

while True:
    # Read the next frame; stop at the end of the video or on a read failure
    ret, frame = cap.read()
    if not ret:
        break
    frame_h, frame_w = frame.shape[:2]

    # Preprocess: scale pixels to [0, 1], resize, and swap BGR to RGB
    blob = cv2.dnn.blobFromImage(frame, 1 / 255, input_size, swapRB=True)
    net.setInput(blob)

    # Forward pass through the unconnected (output) layers
    outs = net.forward(net.getUnconnectedOutLayersNames())

    # Collect detections above the confidence threshold
    class_ids = []
    confidences = []
    boxes = []
    for out in outs:
        for detection in out:
            scores = detection[5:]
            class_id = int(np.argmax(scores))
            confidence = float(scores[class_id])
            if confidence > 0.5:
                # The first four values are the normalized center x, center y,
                # width, and height; scale them to pixel coordinates
                center_x = int(detection[0] * frame_w)
                center_y = int(detection[1] * frame_h)
                width = int(detection[2] * frame_w)
                height = int(detection[3] * frame_h)
                left = center_x - width // 2
                top = center_y - height // 2
                class_ids.append(class_id)
                confidences.append(confidence)
                boxes.append([left, top, width, height])

    # Apply non-maximum suppression (score threshold 0.5, NMS threshold 0.4)
    indices = cv2.dnn.NMSBoxes(boxes, confidences, 0.5, 0.4)

    # Draw the surviving detections. Depending on the OpenCV version, NMSBoxes
    # returns an Nx1 array or a flat array; flattening handles both.
    for i in np.array(indices).flatten():
        left, top, width, height = boxes[i]
        label = classes[class_ids[i]]
        confidence = confidences[i]
        color = (0, 255, 0)
        cv2.rectangle(frame, (left, top), (left + width, top + height), color, 2)
        cv2.putText(frame, f"{label}: {confidence:.2f}", (left, top - 10),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.5, color, 2)

    # Show the frame; press 'q' to quit
    cv2.imshow("Video", frame)
    if cv2.waitKey(1) == ord('q'):
        break

# Release resources
cap.release()
cv2.destroyAllWindows()
```
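This example opens the video file "video.mp4", runs object detection on every frame, draws a rectangle around each detected object, and shows the class name and confidence above the box. Press "q" to exit the program.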
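To make the post-processing step more concrete, here is a pure-Python sketch of the two operations the script performs on detections: converting a normalized center-format box to pixel coordinates, and greedy non-maximum suppression based on intersection over union (IoU). The names `decode_box`, `iou`, and `nms` are hypothetical, and this is an illustrative version of the general technique, not OpenCV's actual `cv2.dnn.NMSBoxes` implementation, so results may differ in edge cases.

```python
def decode_box(cx, cy, w, h, frame_w, frame_h):
    """Convert a normalized (center x, center y, width, height) box to
    pixel [left, top, width, height], mirroring the detection loop."""
    width = int(w * frame_w)
    height = int(h * frame_h)
    left = int(cx * frame_w) - width // 2
    top = int(cy * frame_h) - height // 2
    return [left, top, width, height]


def iou(a, b):
    """Intersection over union of two [left, top, width, height] boxes."""
    x1 = max(a[0], b[0])
    y1 = max(a[1], b[1])
    x2 = min(a[0] + a[2], b[0] + b[2])
    y2 = min(a[1] + a[3], b[1] + b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, score_threshold=0.5, iou_threshold=0.4):
    """Greedy NMS: visit boxes from highest to lowest score and keep a box
    only if it does not overlap an already kept box by more than
    iou_threshold. Returns the indices of the kept boxes."""
    order = sorted(
        (i for i, s in enumerate(scores) if s > score_threshold),
        key=lambda i: scores[i],
        reverse=True,
    )
    kept = []
    for i in order:
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in kept):
            kept.append(i)
    return kept


if __name__ == "__main__":
    print(decode_box(0.5, 0.5, 0.25, 0.5, 640, 480))  # [240, 120, 160, 240]
    # Two heavily overlapping boxes and one separate box
    boxes = [[0, 0, 100, 100], [10, 10, 100, 100], [200, 200, 50, 50]]
    scores = [0.9, 0.8, 0.7]
    print(nms(boxes, scores))  # [0, 2]
```

The default thresholds match the values passed to `cv2.dnn.NMSBoxes` in the script (0.5 for the score, 0.4 for the overlap), so the second box is dropped because it overlaps the higher-scoring first box by about 0.68 IoU.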