Understanding CUDA, cuDNN, and TensorRT

2022-06-28 08:15:00 【The mountain of ignorance, the valley of despair, the slope of 】

CUDA reference: https://www.zhihu.com/question/409350643/answer/1361111350

CUDA

CUDA stands for Compute Unified Device Architecture; in Chinese it is rendered as "unified computing architecture". It is the technology that lets NVIDIA GPUs perform general-purpose computing tasks. CUDA can be used from C, C++, Fortran, Python, and Java, and it accelerates workloads with high data throughput well. In short, it lets a GPU do more than the graphics work it was built for and apply its strengths to general-purpose computation. Beyond everyday video encoding and decoding and games, it is mainly used for compute acceleration. Take the planetary model simulations I have worked with: GPU acceleration greatly speeds up the physics computations in the simulation, which in turn speeds up research output.

CUDA and cuDNN

First, CUDA is an extension of the C language for programming GPUs, while cuDNN is a library that packages convolution and other operators. The two sit at different levels.

Second, the relationship between the two. CUDA can be used to implement the interfaces that cuDNN defines, and early versions of cuDNN were probably implemented in CUDA internally. As NVIDIA's software ecosystem has developed, though, the cuDNN team will surely have turned to tools that are lower level, closer to the hardware, and harder to use for building kernels, such as PTX or even hand-written assembly (SASS). If you doubt it, try implementing a cuDNN interface in CUDA yourself and see how far the performance falls short. Anyone who writes CUDA well knows its limitations.

Finally, the position of the two in the ecosystem. In the beginning, CUDA was what NVIDIA used to conquer the market, and it largely established NVIDIA's position in high-performance computing, especially high-performance computing for neural networks. That is because CUDA struck a delicate balance, acceptable to most people, between exposing hardware features and keeping software general. With the development of technology in recent years, things have changed again. CUDA still carries the job of keeping the software ecosystem general, while high-performance work falls more and more to high-performance libraries such as cuDNN and cuBLAS. In NVIDIA's vision, users get the best performance for mature operators, such as convolution and fully connected layers, directly from the libraries. For new operators, or operators unique to one user, they can still use CUDA to write a version with acceptable performance fairly easily. Frameworks such as TensorRT and TensorFlow then tie the two together.

The relationship between CUDA, cuDNN, and TensorRT

CUDA is the parallel computing framework NVIDIA launched for its own GPUs. In other words, CUDA runs only on NVIDIA GPUs, and it pays off only when the problem to be solved involves a large amount of parallel computation. Its main role is to connect the GPU and applications, so that users can dispatch computation to the GPU through the CUDA API.
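To make "a large amount of parallel computation" concrete, here is a minimal sketch in plain Python that simulates the CUDA indexing scheme on the CPU. It is not CUDA code and uses no GPU. The names `vector_add_kernel` and `launch` are hypothetical, chosen only for illustration. The idea it shows is that each thread computes one output element, located through its block index and thread index.

```python
# Plain-Python simulation of the CUDA thread-indexing model (runs on the CPU).
# vector_add_kernel and launch are hypothetical names used for illustration only.

def vector_add_kernel(block_idx, thread_idx, block_dim, a, b, out):
    """Body executed by one simulated thread: it handles exactly one element."""
    # Global index, analogous to blockIdx.x * blockDim.x + threadIdx.x in CUDA.
    i = block_idx * block_dim + thread_idx
    # Guard: the last block may contain threads that fall past the end of the data.
    if i < len(out):
        out[i] = a[i] + b[i]


def launch(kernel, grid_dim, block_dim, *args):
    """Run the kernel once per (block, thread) pair.

    This loop is sequential; a GPU would execute these iterations in parallel.
    """
    for block_idx in range(grid_dim):
        for thread_idx in range(block_dim):
            kernel(block_idx, thread_idx, block_dim, *args)


n = 10
a = [float(i) for i in range(n)]
b = [2.0 * i for i in range(n)]
out = [0.0] * n

block_dim = 4
# Round up so that grid_dim * block_dim covers all n elements.
grid_dim = (n + block_dim - 1) // block_dim
launch(vector_add_kernel, grid_dim, block_dim, a, b, out)
print(out)
```

Because no iteration depends on another, the work could be spread over thousands of threads at once. Problems without that independence gain far less from a GPU.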

cuDNN (CUDA Deep Neural Network library) is NVIDIA's GPU acceleration library for deep neural networks. It optimizes the computations used in model training and then calls the GPU through CUDA to carry them out.

You can of course use CUDA directly, without cuDNN, but the computation will be much less efficient, because the model's training computations are not optimized.
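One way to picture what "optimized" means here: the same convolution can be computed with different formulations, and a tuned library picks among them (cuDNN exposes several convolution algorithms, including GEMM-based, FFT, and Winograd variants). The sketch below is plain Python on the CPU with hypothetical function names. It only shows that a direct loop and an unfold-then-multiply (im2col) formulation give the same result. It says nothing about how cuDNN implements either one.

```python
# CPU-only illustration; conv1d_direct, im2col and matvec are hypothetical helpers.

def conv1d_direct(x, w):
    """Direct 'valid' 1-D convolution (cross-correlation, as used in deep learning)."""
    k = len(w)
    return [sum(x[i + j] * w[j] for j in range(k)) for i in range(len(x) - k + 1)]


def im2col(x, k):
    """Unfold x into rows of k consecutive samples, one row per output position."""
    return [x[i:i + k] for i in range(len(x) - k + 1)]


def matvec(rows, w):
    """Multiply the unfolded matrix by the filter, one dot product per row."""
    return [sum(r * c for r, c in zip(row, w)) for row in rows]


x = [1.0, 2.0, 3.0, 4.0, 5.0]
w = [1.0, 0.0, -1.0]

direct = conv1d_direct(x, w)
via_matmul = matvec(im2col(x, len(w)), w)
print(direct, via_matmul)
```

The second form turns the convolution into a matrix product, which is the kind of operation that heavily tuned routines already exist for. Hand-written CUDA would have to reproduce that tuning to match a library.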

TensorRT is an acceleration package NVIDIA built for its own platform. It handles only model inference. It is generally not used to train models, but to speed up a model when it is deployed.

TensorRT does two things to speed up a model.
1. TensorRT supports INT8 and FP16 computation. Deep learning networks are usually trained with 32-bit or 16-bit data. At inference time TensorRT uses lower precision, which speeds up inference.
2. TensorRT restructures the network, merging operations that can be combined and optimizing for the characteristics of the GPU. Most deep learning frameworks are not performance-tuned for GPUs, so NVIDIA, as the maker of the GPUs, naturally released TensorRT as an acceleration tool for its own hardware. In an unoptimized deep learning model, a convolution layer, a bias layer, and a ReLU layer, for example, require three separate calls to the corresponding cuDNN APIs, yet the three layers can actually be implemented together. TensorRT merges the layers that can be merged.
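The two points in the list above can be illustrated with a small CPU-only sketch in plain Python. It does not call TensorRT or cuDNN, and every function name in it is hypothetical. The symmetric, single-scale INT8 scheme is an assumption made for illustration, not a description of how TensorRT calibrates a model.

```python
# Part 1: reduced precision. Assumed scheme: symmetric INT8 with one scale per list.

def quantize_int8(values):
    """Map floats to integers in [-127, 127] using a single scale factor."""
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(v / scale))) for v in values]
    return q, scale


def dequantize(q, scale):
    """Recover approximate floats from the INT8 codes."""
    return [qi * scale for qi in q]


weights = [0.5, -1.27, 0.031, 1.0]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
# Rounding to the nearest code costs at most half a quantization step.
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(q, max_error)


# Part 2: layer fusion. Three separate passes versus one combined pass.

def conv1d(x, w):
    k = len(w)
    return [sum(x[i + j] * w[j] for j in range(k)) for i in range(len(x) - k + 1)]


def add_bias(x, b):
    return [v + b for v in x]


def relu(x):
    return [max(0.0, v) for v in x]


def fused_conv_bias_relu(x, w, b):
    """Compute convolution, bias and ReLU per output element in a single pass."""
    k = len(w)
    out = []
    for i in range(len(x) - k + 1):
        acc = 0.0
        for j in range(k):
            acc += x[i + j] * w[j]
        out.append(max(0.0, acc + b))  # bias and ReLU applied before storing
    return out


x = [1.0, -2.0, 3.0, -4.0, 5.0]
w = [0.5, 1.0, -0.5]
b = 0.25

unfused = relu(add_bias(conv1d(x, w), b))  # three passes, two intermediate lists
fused = fused_conv_bias_relu(x, w, b)      # one pass, no intermediate lists
print(unfused, fused)
```

The fused version produces the same values but never writes the intermediate results out and reads them back. On a GPU that saved memory traffic, along with fewer kernel launches, is where the benefit of merging layers comes from.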

Copyright notice
This article was written by [The mountain of ignorance, the valley of despair, the slope of ]. Please include the original link when reposting. Thank you.
https://yzsam.com/2022/179/202206280805466094.html