YOLOv3 Object Detection
2022-06-22 10:40:00 【liu-Mr】
Preface
I have always been interested in OpenCV and find it remarkably capable, and YOLO's reputation precedes it. I have been meaning to learn it for a long time; today I happened to have a suitable video on hand, so I took the opportunity to learn it.
References: the blogs "Python and C++" and "Maple 333".
I. Preparation
Download the YOLOv3-related files: download link
yolov3.weights contains the pre-trained network weights;
yolov3.weights download
yolov3.cfg contains the network configuration;
yolov3.cfg download
coco.names contains the names of the 80 different classes in the COCO dataset.
coco.names download
Note: place these three files in the same directory as the script; otherwise, change the paths used to load them.
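Since readNetFromDarknet fails with a fairly opaque error when a path is wrong, a quick existence check can help. This is my own addition, not part of the original tutorial:

import os
# Optional sanity check: confirm the three files sit next to the script before loading the network
for name in ("yolov3.weights", "yolov3.cfg", "coco.names"):
    if not os.path.isfile(name):
        raise FileNotFoundError(f"{name} not found; download it or adjust the path")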
II. Usage steps
1. Import the libraries
import cv2
import numpy as np
Install the OpenCV library:
pip install opencv-python -i https://pypi.tuna.tsinghua.edu.cn/simple
2. Initialize configuration
coco.names contains the object class names used during model training; this file is read first.
The network is loaded from two files:
yolov3.weights - the pre-trained model weights
yolov3.cfg - the network configuration file
The DNN backend is set to OpenCV and the target to CPU. The target can also be set to cv2.dnn.DNN_TARGET_OPENCL to run on a GPU; note that the current OpenCV version has only been tested with Intel GPUs.
# Open the camera or video file
cap = cv2.VideoCapture(r"./data/test.mp4")
# coco.names stores the names of the 80 trained classes; they correspond
# one-to-one with the 80 categories YOLO was trained on
classesFile = r"coco.names"
# List of class names
classNames = []
with open(classesFile, "rt") as f:
    # Read the file line by line
    classNames = f.read().splitlines()
# Print all class names
print(classNames)
# Configure YOLOv3
modelConfiguration = "yolov3.cfg"  # network configuration file
modelWeights = "yolov3.weights"    # pre-trained weights file
net = cv2.dnn.readNetFromDarknet(modelConfiguration, modelWeights)  # load the model into the DNN module
net.setPreferableBackend(cv2.dnn.DNN_BACKEND_OPENCV)  # set the DNN backend to OpenCV
net.setPreferableTarget(cv2.dnn.DNN_TARGET_CPU)       # run inference on the CPU
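If an OpenCL-capable GPU is available, the target can be switched instead. This is a sketch following the note above (and, as mentioned, only Intel GPUs are currently tested):

net.setPreferableTarget(cv2.dnn.DNN_TARGET_OPENCL)  # run on the GPU instead of the CPU (assumes OpenCL support)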
3. Set up the network
The neural network's input image must be organized in a specific format called a blob.
After a frame is read, it is passed through the blobFromImage function, which converts it into the network's input blob. In the process, pixel values are scaled by a factor of 1/255 into the [0, 1] range, and the image is resized to (inpWidth, inpHeight) without cropping.
Note: no mean subtraction is performed, so the mean parameter of the function is [0, 0, 0]. The network input size is set through inpWidth and inpHeight; 416 is a common choice, 320 (used in the complete code below) is faster, and 608 is more accurate.
The blob produced from the input image is fed into the network, and a forward pass is run to obtain a list of predicted bounding boxes. The network's predictions are then post-processed to filter out low-confidence boxes.
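As a quick aside (a sketch of mine, using a dummy frame), the blob uses NCHW layout:

import cv2
import numpy as np
# Sketch: inspect the blob layout using a dummy BGR frame
frame = np.zeros((480, 640, 3), dtype=np.uint8)
blob = cv2.dnn.blobFromImage(frame, 1 / 255, (416, 416), [0, 0, 0], True, False)
print(blob.shape)  # (1, 3, 416, 416): batch, channels, height, width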
The forward function of OpenCV's Net class needs to know the network's final output layers.
Because we want to run the entire network, we need to identify its last layers. The getUnconnectedOutLayers() function returns the unconnected layers, which are generally the network's output layers.
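For illustration (a sketch; the printed values are typical for yolov3.cfg but depend on the OpenCV version):

# Illustration: YOLOv3 has three unconnected output layers (its detection heads).
# With OpenCV 4.5.4+, getUnconnectedOutLayers() returns a flat array of 1-based indices.
layerNames = net.getLayerNames()
print(net.getUnconnectedOutLayers())  # e.g. [200 227 254]
print([layerNames[i - 1] for i in net.getUnconnectedOutLayers()])  # e.g. ['yolo_82', 'yolo_94', 'yolo_106']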
# Note: inpWidth, inpHeight, and findObjects are defined in the complete code below
while True:
    # Read one frame
    success, frame = cap.read()
    if not success:  # stop when the video ends or the camera fails
        break
    # The DNN module expects its input in a specific format called a blob
    blob = cv2.dnn.blobFromImage(frame, 1 / 255, (inpWidth, inpHeight), [0, 0, 0], True, False)
    # Feed the blob into the network
    net.setInput(blob)
    # Get the names of all layers in the network
    layerNames = net.getLayerNames()
    # Get the output layer names so the forward pass runs the whole network
    outputNames = [layerNames[i - 1] for i in net.getUnconnectedOutLayers()]
    outputs = net.forward(outputNames)
    findObjects(outputs, frame)
    # Show the annotated frame
    cv2.imshow("img", frame)
    # Exit on the ESC key
    if cv2.waitKey(10) == 27:
        break
# Release resources
cap.release()
cv2.destroyAllWindows()
4. Frame processing
Each bounding box output by the network is represented as a vector of (number of classes + 5) elements.
The first 4 elements are center_x, center_y, width, and height.
The 5th element is the confidence that the bounding box contains an object.
The remaining elements are the confidences (probabilities) associated with each class. The box is assigned to the class with the highest score, and that highest score is also called the box's confidence. If a box's confidence is below the given threshold, the bounding box is discarded and not processed further.
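To make this layout concrete, here is a small decoding sketch with made-up values (not output from a real network):

import numpy as np
# Sketch: decode one 85-element detection row (4 box values + objectness + 80 class scores)
det = np.zeros(85, dtype=np.float32)  # dummy row for illustration
det[:4] = [0.5, 0.5, 0.2, 0.3]        # center_x, center_y, width, height (relative to the input)
det[4] = 0.9                          # objectness: confidence that the box contains an object
det[5 + 16] = 0.8                     # pretend class index 16 ("dog" in the standard coco.names) scores highest
classId = int(np.argmax(det[5:]))     # -> 16
confidence = det[5 + classId]         # -> 0.8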
Boxes whose confidence is greater than or equal to the confidence threshold are then passed through non-maximum suppression (NMS), which reduces the number of overlapping boxes.
If the nmsThreshold parameter is too small, e.g. 0.1, overlapping objects of the same or different classes may not all be detected.
If it is too large, e.g. 1, multiple boxes may be kept for the same object.
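A toy example (made-up boxes, my own addition) shows the effect of NMS:

import cv2
# Toy NMS example: boxes 0 and 1 overlap heavily (IoU of about 0.78), so the lower-scoring one is dropped
boxes = [[100, 100, 50, 80], [105, 102, 50, 80], [300, 200, 40, 40]]
scores = [0.9, 0.75, 0.6]
print(cv2.dnn.NMSBoxes(boxes, scores, 0.5, 0.4))  # keeps indices 0 and 2; box 1 is suppressed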
# YOLOv3 detection post-processing
def findObjects(outputs, img):
    hT, wT, cT = img.shape  # height, width, and channels of the original frame
    bbox = []      # bounding-box coordinates of candidate detections
    classIds = []  # class index of each detection
    confs = []     # confidence value of each detection
    for output in outputs:  # iterate over the three output layers
        for det in output:  # iterate over the detections in each layer
            scores = det[5:]  # the scores for all 80 classes
            classId = np.argmax(scores)  # index of the highest-scoring class
            confidence = scores[classId]  # the highest class score
            if confidence > confThreshold:  # keep only sufficiently confident detections
                # Convert the relative box to pixel coordinates (top-left corner, width, height)
                w, h = int(det[2] * wT), int(det[3] * hT)
                x, y = int((det[0] * wT) - w / 2), int((det[1] * hT) - h / 2)
                bbox.append([x, y, w, h])  # store the box so NMS can process all detections in the frame
                classIds.append(classId)  # class index (0-79), used to look up the name in coco.names
                confs.append(float(confidence))
    # Apply non-maximum suppression to remove overlapping boxes
    indices = cv2.dnn.NMSBoxes(bbox, confs, confThreshold, nmsThreshold)
    for i in indices:  # with OpenCV 4.5.4+, NMSBoxes returns a flat array of indices
        box = bbox[i]  # a box that survived NMS
        x, y, w, h = box[0], box[1], box[2], box[3]
        # Draw a rectangle around each detected object
        cv2.rectangle(img, (x, y), (x + w, y + h), (255, 0, 255), 2)
        # Label the box with the class name from coco.names and the confidence
        cv2.putText(img, f'{classNames[classIds[i]].capitalize()} {int(confs[i] * 100)}%',
                    (x, y - 10), cv2.FONT_HERSHEY_SIMPLEX, 0.6, (255, 0, 255), 2)
Complete code:
import cv2
import numpy as np

# Open the camera or video file
cap = cv2.VideoCapture(r"./data/test.mp4")
# Confidence threshold for keeping a detection
confThreshold = 0.5
# Non-maximum suppression threshold
nmsThreshold = 0.2
# Width and height of the network input image
inpWidth = 320
inpHeight = 320

# coco.names stores the names of the 80 trained classes; they correspond
# one-to-one with the 80 categories YOLO was trained on
classesFile = r"coco.names"
# List of class names
classNames = []
with open(classesFile, "rt") as f:
    # Read the file line by line
    classNames = f.read().splitlines()
# Print all class names
print(classNames)

# Configure YOLOv3
modelConfiguration = "yolov3.cfg"  # network configuration file
modelWeights = "yolov3.weights"    # pre-trained weights file
net = cv2.dnn.readNetFromDarknet(modelConfiguration, modelWeights)  # load the model into the DNN module
net.setPreferableBackend(cv2.dnn.DNN_BACKEND_OPENCV)  # set the DNN backend to OpenCV
net.setPreferableTarget(cv2.dnn.DNN_TARGET_CPU)       # run inference on the CPU

# YOLOv3 detection post-processing
def findObjects(outputs, img):
    hT, wT, cT = img.shape  # height, width, and channels of the original frame
    bbox = []      # bounding-box coordinates of candidate detections
    classIds = []  # class index of each detection
    confs = []     # confidence value of each detection
    for output in outputs:  # iterate over the three output layers
        for det in output:  # iterate over the detections in each layer
            scores = det[5:]  # the scores for all 80 classes
            classId = np.argmax(scores)  # index of the highest-scoring class
            confidence = scores[classId]  # the highest class score
            if confidence > confThreshold:  # keep only sufficiently confident detections
                # Convert the relative box to pixel coordinates (top-left corner, width, height)
                w, h = int(det[2] * wT), int(det[3] * hT)
                x, y = int((det[0] * wT) - w / 2), int((det[1] * hT) - h / 2)
                bbox.append([x, y, w, h])  # store the box so NMS can process all detections in the frame
                classIds.append(classId)  # class index (0-79), used to look up the name in coco.names
                confs.append(float(confidence))
    # Apply non-maximum suppression to remove overlapping boxes
    indices = cv2.dnn.NMSBoxes(bbox, confs, confThreshold, nmsThreshold)
    for i in indices:  # with OpenCV 4.5.4+, NMSBoxes returns a flat array of indices
        box = bbox[i]  # a box that survived NMS
        x, y, w, h = box[0], box[1], box[2], box[3]
        # Draw a rectangle around each detected object
        cv2.rectangle(img, (x, y), (x + w, y + h), (255, 0, 255), 2)
        # Label the box with the class name from coco.names and the confidence
        cv2.putText(img, f'{classNames[classIds[i]].capitalize()} {int(confs[i] * 100)}%',
                    (x, y - 10), cv2.FONT_HERSHEY_SIMPLEX, 0.6, (255, 0, 255), 2)

while True:
    # Read one frame
    success, frame = cap.read()
    if not success:  # stop when the video ends or the camera fails
        break
    # The DNN module expects its input in a specific format called a blob
    blob = cv2.dnn.blobFromImage(frame, 1 / 255, (inpWidth, inpHeight), [0, 0, 0], True, False)
    # Feed the blob into the network
    net.setInput(blob)
    # Get the names of all layers in the network
    layerNames = net.getLayerNames()
    # Get the output layer names so the forward pass runs the whole network
    outputNames = [layerNames[i - 1] for i in net.getUnconnectedOutLayers()]
    outputs = net.forward(outputNames)
    findObjects(outputs, frame)
    # Show the annotated frame
    cv2.imshow("img", frame)
    # Exit on the ESC key
    if cv2.waitKey(10) == 27:
        break

# Release resources
cap.release()
cv2.destroyAllWindows()
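To try the detector on a single image instead of a video stream, a minimal variation could look like the sketch below; it reuses net, inpWidth, inpHeight, and findObjects from above, and the image path is an assumption:

# Single-image variant (sketch; reuses net, the thresholds, and findObjects defined above)
img = cv2.imread("test.jpg")  # the path is an assumption; point it at any test image
blob = cv2.dnn.blobFromImage(img, 1 / 255, (inpWidth, inpHeight), [0, 0, 0], True, False)
net.setInput(blob)
layerNames = net.getLayerNames()
outputNames = [layerNames[i - 1] for i in net.getUnconnectedOutLayers()]
findObjects(net.forward(outputNames), img)
cv2.imshow("img", img)
cv2.waitKey(0)  # wait for any key instead of polling, since there is only one frame
cv2.destroyAllWindows()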
Result:
Summary
YOLOv3 object detection has many more capabilities, and I will share more interesting code in the future. This is a beginner's article, and the official code is used essentially unmodified; please bear with me.