yolov5: speeding up multi-GPU training when GPU memory usage is low
2022-06-25 22:13:00 【所向披靡的张大刀】
Low GPU memory usage in yolov5 multi-GPU training
Before the change:
Following the usual setup, train.py was configured as follows:
After running python train.py, nvidia-smi showed the following GPU memory usage:
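After the change
This follows the distributed multi-process approach that someone suggested in an issue on the official yolov5 repository:
In the virtual environment that yolov5 runs in, locate torch's distributed directory; for example, mine is at conda3/envs/rcnn/lib/python3.6/site-packages/torch/distributed/;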
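If you are not sure where that directory is in your own environment, a minimal sketch like the one below can print it. The helper name find_package_dir is hypothetical (it is not part of torch or yolov5), and the sketch assumes it is run with the same Python interpreter that yolov5 uses.

```python
import importlib.util
import os


def find_package_dir(name):
    """Return the directory of an importable package, or None if not found.

    Hypothetical helper for illustration only.
    """
    try:
        spec = importlib.util.find_spec(name)
    except ModuleNotFoundError:
        # Raised when a parent package (for example torch) is not installed.
        return None
    if spec is None or spec.origin is None:
        return None
    return os.path.dirname(spec.origin)


if __name__ == "__main__":
    # Prints the torch/distributed directory, or None if torch is missing.
    print(find_package_dir("torch.distributed"))
```

Run it inside the activated environment; the printed path is where the launch script from the next step would be placed.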
In the distributed folder, create a multi-process launch script named yolov5_launch.py:
import sys
import subprocess
import os
from argparse import ArgumentParser, REMAINDER


def parse_args():
    """
    Helper function parsing the command line options
    @retval ArgumentParser
    """
    parser = ArgumentParser(description="PyTorch distributed training launch "
                                        "helper utility that will spawn up "
                                        "multiple distributed processes")

    # Optional arguments for the launch helper
    parser.add_argument("--nnodes", type=int, default=1,
                        help="The number of nodes to use for distributed "
                             "training")
    parser.add_argument("--node_rank", type=int, default=0,
                        help="The rank of the node for multi-node distributed "
                             "training")
    # Change the default below to the number of GPUs you have
    parser.add_argument("--nproc_per_node", type=int, default=2,
                        help="The number of processes to launch on each node, "
                             "for GPU training, this is recommended to be set "
                             "to the number of GPUs in your system so that "
                             "each process can be bound to a single GPU.")
    parser.add_argument("--master_addr", default="127.0.0.1", type=str,
                        help="Master node (rank 0)'s address, should be either "
                             "the IP address or the hostname of node 0, for "
                             "single node multi-proc training, the "
                             "--master_addr can simply be 127.0.0.1")
    parser.add_argument("--master_port", default=29528, type=int,
                        help="Master node (rank 0)'s free port that needs to "
                             "be used for communication during distributed "
                             "training")
    parser.add_argument("--use_env", default=False, action="store_true",
                        help="Use environment variable to pass "
                             "'local rank'. For legacy reasons, the default value is False. "
                             "If set to True, the script will not pass "
                             "--local_rank as argument, and will instead set LOCAL_RANK.")
    parser.add_argument("-m", "--module", default=False, action="store_true",
                        help="Changes each process to interpret the launch script "
                             "as a python module, executing with the same behavior as "
                             "'python -m'.")
    parser.add_argument("--no_python", default=False, action="store_true",
                        help="Do not prepend the training script with \"python\" - just exec "
                             "it directly. Useful when the script is not a Python script.")

    # The positional arguments of the original launcher are disabled here;
    # the training script path is hard-coded in main() instead.
    # # positional
    # parser.add_argument("training_script", type=str,default=r"train,py"
    #                     help="The full path to the single GPU training "
    #                          "program/script to be launched in parallel, "
    #                          "followed by all the arguments for the "
    #                          "training script")
    # # rest from the training program
    # parser.add_argument('training_script_args', nargs=REMAINDER)
    return parser.parse_args()


def main():
    args = parse_args()
    # Change this to the absolute path of the train.py you want to run
    args.training_script = r"yolov5-master/train.py"

    # world size in terms of number of processes
    dist_world_size = args.nproc_per_node * args.nnodes

    # set PyTorch distributed related environmental variables
    current_env = os.environ.copy()
    current_env["MASTER_ADDR"] = args.master_addr
    current_env["MASTER_PORT"] = str(args.master_port)
    current_env["WORLD_SIZE"] = str(dist_world_size)

    processes = []

    if 'OMP_NUM_THREADS' not in os.environ and args.nproc_per_node > 1:
        current_env["OMP_NUM_THREADS"] = str(1)
        print("*****************************************\n"
              "Setting OMP_NUM_THREADS environment variable for each process "
              "to be {} in default, to avoid your system being overloaded, "
              "please further tune the variable for optimal performance in "
              "your application as needed. \n"
              "*****************************************".format(current_env["OMP_NUM_THREADS"]))

    for local_rank in range(0, args.nproc_per_node):
        # each process's rank
        dist_rank = args.nproc_per_node * args.node_rank + local_rank
        current_env["RANK"] = str(dist_rank)
        current_env["LOCAL_RANK"] = str(local_rank)

        # spawn the processes
        with_python = not args.no_python
        cmd = []
        if with_python:
            cmd = [sys.executable, "-u"]
            if args.module:
                cmd.append("-m")
        else:
            if not args.use_env:
                raise ValueError("When using the '--no_python' flag, you must also set the '--use_env' flag.")
            if args.module:
                raise ValueError("Don't use both the '--no_python' flag and the '--module' flag at the same time.")

        cmd.append(args.training_script)

        if not args.use_env:
            cmd.append("--local_rank={}".format(local_rank))

        # cmd.extend(args.training_script_args)

        process = subprocess.Popen(cmd, env=current_env)
        processes.append(process)

    # wait for every worker and fail loudly if any of them exited with an error
    for process in processes:
        process.wait()
        if process.returncode != 0:
            raise subprocess.CalledProcessError(returncode=process.returncode,
                                                cmd=cmd)


if __name__ == "__main__":
    # import os
    # os.environ['CUDA_VISIBLE_DEVICES'] = "0,1"
    main()
Run the script above: python yolov5_launch.py

GPU memory usage now exceeds 80%. Note that at this point you can also increase batch_size in the train.py configuration;
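To make the bookkeeping in the launch script easier to follow, here is a small sketch of the environment variables it sets for each worker process. The function name rank_env is hypothetical; the formulas (WORLD_SIZE = nnodes * nproc_per_node, RANK = nproc_per_node * node_rank + local_rank) and the default port 29528 are taken from the script above.

```python
def rank_env(nnodes, node_rank, nproc_per_node,
             master_addr="127.0.0.1", master_port=29528):
    """Return one dict of distributed environment variables per local process.

    Hypothetical helper that mirrors the arithmetic in yolov5_launch.py.
    """
    world_size = nnodes * nproc_per_node
    envs = []
    for local_rank in range(nproc_per_node):
        envs.append({
            "MASTER_ADDR": master_addr,
            "MASTER_PORT": str(master_port),
            "WORLD_SIZE": str(world_size),
            # Global rank: offset by the processes on lower-ranked nodes.
            "RANK": str(nproc_per_node * node_rank + local_rank),
            "LOCAL_RANK": str(local_rank),
        })
    return envs


# Single node with two GPUs, as in the article's default settings.
for env in rank_env(nnodes=1, node_rank=0, nproc_per_node=2):
    print(env["RANK"], env["LOCAL_RANK"], env["WORLD_SIZE"])
```

On a single machine the global rank and the local rank coincide; they only differ once node_rank is greater than 0.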
Another method
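I found another method online that avoids the hassle of creating a new file in the distributed folder. When training, add -m torch.distributed.launch --nproc_per_node (set this to your number of GPUs) after python, then train.py, followed by the usual configuration options:
python -m torch.distributed.launch --nproc_per_node 2 train.py --batch-size 64 --data data/Allcls_one.yaml --weights weights/yolov5l.pt --cfg models/yolov5l_1cls.yaml --epochs 1 --device 0,1
I have tested this method and it works; it is simpler and more effective than the first one!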
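As a rough illustration of how that command line scales with the number of GPUs, the sketch below assembles the argument list and computes the per-GPU share of the batch. Both helper names are hypothetical. The batch calculation rests on an assumption worth checking against your yolov5 version: that --batch-size is the total across all GPUs and is split evenly between them.

```python
def build_launch_cmd(n_gpus, train_script="train.py", train_args=()):
    """Build the torch.distributed.launch command as an argument list.

    Hypothetical helper; it only assembles strings and runs nothing.
    """
    if n_gpus < 1:
        raise ValueError("n_gpus must be at least 1")
    devices = ",".join(str(i) for i in range(n_gpus))
    return ["python", "-m", "torch.distributed.launch",
            "--nproc_per_node", str(n_gpus),
            train_script, *train_args,
            "--device", devices]


def per_gpu_batch(total_batch, n_gpus):
    """Per-GPU batch size, assuming the total is split evenly across GPUs."""
    if total_batch % n_gpus != 0:
        raise ValueError("total batch size must be a multiple of the GPU count")
    return total_batch // n_gpus


cmd = build_launch_cmd(2, train_args=["--batch-size", "64", "--epochs", "1"])
print(" ".join(cmd))
print(per_gpu_batch(64, 2))
```

Under that assumption, raising --batch-size is what actually fills the extra memory on each card, so it should stay a multiple of the GPU count.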