
A deep understanding of zero-copy technology

2022-06-21 16:12:00 morpheusWB

Preface

Zero-copy refers to techniques in which the CPU does not have to copy data from one memory area to another while the computer performs an operation. It is typically used to save CPU cycles and memory bandwidth when transferring files over a network.

A conventional network request switches between user mode and kernel mode and copies the data several times, which clearly hurts processing efficiency. Zero-copy techniques were created to solve this problem.

Most of the high-performance components we commonly use (Netty, Kafka, etc.) rely on zero-copy internally, so it is worth understanding what zero-copy is before studying them.

Traditional file transfer: read + write

DMA copy: DMA (direct memory access) is an interface technology that lets a peripheral device exchange data directly with system memory without going through the CPU.

As shown in the figure above, a traditional network transfer requires 4 switches between user mode and kernel mode and 4 data copies (2 CPU copies and 2 DMA copies).

A context switch involves the operating system and is very expensive relative to CPU speed. On top of that, a single file transfer needs 4 data copies, which wastes a great deal of CPU resources.

It is easy to see that traditional network transfer involves many redundant operations, so performance degrades badly under high concurrency.
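
The traditional path can be sketched in a few lines of Python. This is only an illustration: the helper name `send_file_read_write` and the 64 KiB chunk size are assumptions made for this example. Each loop iteration issues a read and a send, and the data passes through a user-space buffer on the way.

```python
import socket


def send_file_read_write(path: str, sock: socket.socket,
                         chunk_size: int = 64 * 1024) -> int:
    """Send a file the traditional way and return the number of bytes sent.

    Hypothetical helper for illustration only.
    """
    total = 0
    # buffering=0 gives a raw file object, so each read() is a real system call.
    with open(path, "rb", buffering=0) as f:
        while True:
            buf = f.read(chunk_size)  # kernel buffer -> user buffer (CPU copy)
            if not buf:               # empty bytes object means end of file
                break
            sock.sendall(buf)         # user buffer -> socket buffer (CPU copy)
            total += len(buf)
    return total
```

The two DMA copies (disk to kernel buffer, socket buffer to network card) are not visible in the code; the kernel and hardware perform them behind these calls.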

Zero-copy was created to solve this problem. It is really an abstract concept; in essence, it works by reducing the number of context switches and data copies.

mmap + write

As shown in the figure above, transferring a file with mmap requires 4 switches between user mode and kernel mode and 3 data copies (1 CPU copy and 2 DMA copies).

Compared with the traditional transfer, mmap saves one CPU copy. The process is as follows:

  1. The application process calls mmap(). DMA copies the disk data into the kernel buffer, and the application process then "shares" this buffer with the operating system kernel.
  2. The application process then calls write(). The operating system copies the data from the kernel buffer directly to the socket buffer. All of this happens in kernel mode, and the CPU moves the data.
  3. Finally, the data in the kernel's socket buffer is copied to the network card buffer. This copy is carried out by DMA.

Clearly, saving just one data copy is still not enough.
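
As a rough sketch of the mmap + write path, Python's standard `mmap` module can map a file and hand the mapping straight to the socket. The helper name `send_file_mmap` is an assumption made for this example, and the sketch maps the whole file at once, which may not suit very large files.

```python
import mmap
import socket


def send_file_mmap(path: str, sock: socket.socket) -> int:
    """Send a file via mmap + write and return the number of bytes sent.

    Hypothetical helper for illustration only.
    """
    with open(path, "rb") as f:
        # Length 0 maps the entire file. mmap raises ValueError for an
        # empty file, so callers would need to handle that case separately.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            # mmap objects support the buffer protocol, so sendall() reads
            # directly from the mapped pages without an extra bytes object.
            sock.sendall(mm)
            return len(mm)
```

There is no read() into a user buffer here: the send reads from pages shared with the kernel, which corresponds to the single CPU copy in step 2.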

sendfile

As shown in the figure above, transferring a file with sendfile requires 2 switches between user mode and kernel mode and 3 data copies (1 CPU copy and 2 DMA copies).

Compared with mmap, it also eliminates two context switches. The process is as follows:

  1. The application calls the sendfile interface and passes in the file descriptors. Execution switches to kernel mode, and DMA copies the data on disk into the kernel buffer.
  2. The CPU copies the data from the kernel buffer to the socket buffer.
  3. DMA copies the data to the network card buffer, and execution switches back to user mode.

sendfile in effect merges the original two steps, read and write, into a single call, which eliminates 2 context switches. It is still not a true "zero" copy, however.
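
A minimal sketch of this path in Python uses `os.sendfile`, which wraps the sendfile system call. The helper name `send_file_sendfile` is an assumption made for this example, and the loop is there because a single call may send fewer bytes than requested.

```python
import os
import socket


def send_file_sendfile(path: str, sock: socket.socket) -> int:
    """Send a file with sendfile and return the number of bytes sent.

    Hypothetical helper for illustration only.
    """
    total = 0
    with open(path, "rb") as f:
        size = os.fstat(f.fileno()).st_size
        while total < size:
            # os.sendfile(out_fd, in_fd, offset, count) returns the number
            # of bytes actually sent, which may be less than count.
            sent = os.sendfile(sock.fileno(), f.fileno(), total, size - total)
            if sent == 0:  # nothing more to send (e.g. file was truncated)
                break
            total += sent
    return total
```

The file contents never enter a user-space buffer: the application only passes two file descriptors, an offset, and a length. The standard library also offers `socket.sendfile()`, a higher-level wrapper that uses `os.sendfile` where it is available.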

sendfile + SG-DMA

Starting with Linux kernel 2.4, the sendfile() system call behaves a little differently when the network card supports SG-DMA (scatter-gather DMA). As shown in the figure above, transferring a file with sendfile + SG-DMA requires 2 switches between user mode and kernel mode and 2 data copies (1 DMA copy and 1 SG-DMA copy).

The process is as follows:

  1. DMA copies the data on disk into the kernel buffer.
  2. The buffer descriptor and the data length are passed to the socket buffer, so the network card's SG-DMA controller can copy the data from the kernel buffer directly to the network card buffer. The data no longer has to be copied from the kernel buffer to the socket buffer, which saves one data copy.

Compared with the previous approaches, this one truly eliminates the CPU copy. The CPU cache is no longer polluted, and the CPU can run other computing tasks in parallel with the DMA I/O, which greatly improves system performance.

Its drawback is just as obvious: it depends heavily on hardware support.

splice

Linux introduced the splice system call in version 2.6.17. It no longer requires hardware support, and it also achieves zero-copy of data between two file descriptors.

The splice system call can build a pipeline between the read buffer in kernel space and the socket buffer, which avoids the CPU copy between them.

With splice-based zero-copy, the whole process involves 2 switches between user mode and kernel mode and 2 data copies (2 DMA copies). The process is as follows:

  1. The user process calls splice(), making a system call into the kernel, and the context switches from user space to kernel space.
  2. The CPU uses the DMA controller to copy data from main memory or the hard disk into the read buffer in kernel space.
  3. The CPU builds a pipeline between the read buffer in kernel space and the socket buffer.
  4. The CPU uses the DMA controller to copy the data from the socket buffer to the network card for transmission.
  5. The context switches from kernel space back to user space, and the splice system call returns.

The splice approach also has the limitation that the user program cannot modify the data. In addition, it relies on the Linux pipe buffer mechanism, so it can transfer data between any two file descriptors, but one of its two file descriptor arguments must be a pipe device.
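
The pipe requirement means a file-to-socket transfer takes two splice calls per chunk: file to pipe, then pipe to socket. The sketch below assumes Linux and Python 3.10 or later, where `os.splice` is available; the helper name `send_file_splice` and the 64 KiB chunk size are assumptions made for this example.

```python
import os
import socket


def send_file_splice(path: str, sock: socket.socket,
                     chunk_size: int = 64 * 1024) -> int:
    """Send a file with splice via a pipe and return the number of bytes sent.

    Hypothetical helper for illustration only (Linux, Python 3.10+).
    """
    total = 0
    pipe_r, pipe_w = os.pipe()  # splice needs a pipe on one side of each call
    try:
        with open(path, "rb") as f:
            while True:
                # File -> pipe. Returns 0 at end of file.
                moved = os.splice(f.fileno(), pipe_w, chunk_size)
                if moved == 0:
                    break
                # Pipe -> socket. Drain everything just moved into the pipe,
                # since one call may transfer fewer bytes than requested.
                remaining = moved
                while remaining > 0:
                    remaining -= os.splice(pipe_r, sock.fileno(), remaining)
                total += moved
    finally:
        os.close(pipe_r)
        os.close(pipe_w)
    return total
```

As with sendfile, the user program only handles file descriptors and byte counts, which is why it cannot inspect or modify the data in flight.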

Summary

This article has briefly introduced several zero-copy techniques in Linux. As technology keeps developing, other techniques have appeared, such as copy-on-write and shared buffers; this article does not cover them.

Broadly speaking, Linux zero-copy techniques fall into three categories:

  • Reducing or even avoiding data copies between user space and kernel space: in some scenarios the user process does not need to access or process the data while it is being transferred, so the transfer between the Linux Page Cache and the user process buffer can be avoided entirely. The data copy then happens completely inside the kernel, and in some cases even the copy inside the kernel can be avoided. This is usually achieved by adding new system calls, such as mmap(), sendfile(), and splice() in Linux.
  • Direct I/O that bypasses the kernel: a user-mode process is allowed to bypass the kernel and transfer data directly to and from the hardware, while the kernel only handles management and auxiliary work during the transfer. This resembles the first category in that it also tries to avoid transferring data between user space and kernel space. The difference is that the first category keeps the transfer inside the kernel, whereas this one bypasses the kernel and communicates with the hardware directly. The effect is similar, but the principle is completely different.
  • Optimizing the transfer between the kernel buffer and the user buffer: this approach focuses on optimizing the CPU copy between the user process buffer and the operating system's page cache. It keeps the traditional communication model but is more flexible.

Copyright notice
This article was written by [morpheusWB]. Please include the original link when reposting. Thank you.
https://yzsam.com/2022/172/202206211409254724.html