
Network interview "boilerplate": daily knowledge points (TCP, the thundering herd problem, coroutines)

2022-06-22 04:37:00 yun6853992

Based on continuous study, I always feel I can say something about the questions below.

But I also know that without doing any actual sorting out or testing, my understanding stays theoretical, and it feels like I do not really understand.

So, with some searching, I have gone through these questions and written up a knowledge backup for myself based on my own understanding. If anything is wrong, please correct me...

0: Summary

Before going through the details below, I write my overall understanding up front.

1: Sort out listen and accept, plus the half-connection queue and the full-connection queue.

After listen, the three-way handshake can proceed; during the handshake, the kernel protocol stack maintains a half-connection queue and a full-connection queue.

The backlog parameter of listen, together with operating-system parameters, determines the sizes of the half-connection queue and the full-connection queue; on overflow, the kernel drops the connection or sends an RST segment.

accept runs after the handshake has completed (the final ACK has been received): it takes a completed connection off the full-connection queue.

[Figure: the half-connection queue and full-connection queue during the three-way handshake]

Reference: TCP Actual Combat II (half-connection queue, full-connection queue) - Ball Control Obsessive-Compulsive Disorder - Blog Garden (cnblogs.com)

2: Some useful TCP knowledge

TCP achieves ordered transmission through sequence numbers plus the ACK/retransmission mechanism.

Through slow start, congestion avoidance, fast retransmit, fast recovery, and the sliding windows (send window, congestion window, receive window, slow-start threshold), it implements congestion control and flow control.

Large numbers of CLOSE_WAIT states appear because the server does not call close in time; CLOSING is the state that appears when both ends call close simultaneously.

When recv returns a length of 0, the peer has sent a close; the server should perform its own close as soon as possible.

When recv fails on an event, or there is no data left inside, it returns -1 with errno set to EAGAIN; a useful point when reading in a loop in ET mode.

3: Thundering herd notes

Recent Linux versions already handle the thundering herd at the bottom of the operating system: only the first process is woken, so under normal conditions a simulation will not show the phenomenon.

Seen this way, on a recent Linux environment the thundering herd problem itself has been dealt with; but can we really rely on the operating system that much?

The phenomenon can still be simulated by increasing the waiting time...

4: Remaining questions:

1: TCP is stream-based, so at send time, if the buffer holds data from several packets, how is that handled?

2: nginx's event-triggered code logic, the accept lock, and load balancing across worker processes are all worth sorting out.

===》 How is the accept lock released as soon as possible?

3: Coroutines can be based on setjmp/longjmp (State Threads), on glibc's ucontext component (coroutine), on assembly (libco, ntyco), or, as described online, on switch-case (Protothreads); there is also the C-language hook technique used together with assembly in one implementation.

4: What is the relationship between coroutines and network IO? How are they used together?

5: Try strace on the examples below.

1: What does the backlog parameter of listen mean?

Directly in a Linux environment, use man listen to view the description and understand it.

 #include <sys/types.h> /* See NOTES */
 #include <sys/socket.h>

 int listen(int sockfd, int backlog);
/* DESCRIPTION
   listen() marks the socket referred to by sockfd as a passive socket, that is,
   as a socket that will be used to accept incoming connection requests using
   accept(2).

   The sockfd argument is a file descriptor that refers to a socket of type
   SOCK_STREAM or SOCK_SEQPACKET.

   The backlog argument defines the maximum length to which the queue of pending
   connections for sockfd may grow. If a connection request arrives when the
   queue is full, the client may receive an error with an indication of
   ECONNREFUSED or, if the underlying protocol supports retransmission, the
   request may be ignored so that a later reattempt at connection succeeds.

   RETURN VALUE
   On success, zero is returned. On error, -1 is returned, and errno is set
   appropriately.
   .... */

The description in short: listen marks sockfd (a SOCK_STREAM or SOCK_SEQPACKET socket) as a passive socket that accept(2) will take incoming connections from; backlog caps the pending-connection queue, and when the queue is full a client either gets ECONNREFUSED or, if the underlying protocol supports retransmission, the request is ignored so a later retry can succeed.

My understanding:

===》listen works on a sockfd that has already been bound (bind has been executed); from then on, while clients connect to this fd, the kernel maintains a half-connection queue and a full-connection queue.

===》While a client is in connect, the kernel tracks the connecting clientfds in the half-connection queue (after the client has sent SYN and the server has replied SYN+ACK).

===》backlog describes the sizes of the half-connection queue and the full-connection queue (together with the sizes set in the kernel, which determine the effective queue sizes).

===》If the queue is full, the client receives an error indicated by ECONNREFUSED (seen in the return value of connect? This is the case where the server refuses and replies with RST).

=========》When the half-connection queue or the full-connection queue is full and a new connection arrives, the kernel either simply drops it or replies with an RST segment.

===》If the underlying protocol supports retransmission, the connection request can instead be ignored, so the client's retransmitted SYN connects later. (A minimal sketch of the listening side follows.)
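To make the parameter concrete, here is a minimal sketch of the listening side (port 9999 and backlog 128 are arbitrary illustration values; the effective backlog is additionally clamped by kernel settings such as net.core.somaxconn):

#include <stdio.h>
#include <string.h>
#include <netinet/in.h>
#include <sys/socket.h>

int main(void)
{
	int fd = socket(AF_INET, SOCK_STREAM, 0);

	struct sockaddr_in addr;
	memset(&addr, 0, sizeof(addr));
	addr.sin_family = AF_INET;
	addr.sin_port = htons(9999);              // arbitrary port for illustration
	addr.sin_addr.s_addr = htonl(INADDR_ANY);
	bind(fd, (struct sockaddr *)&addr, sizeof(addr));

	// backlog = 128 caps the full-connection queue for this fd; the kernel
	// clamps it to net.core.somaxconn, and the half-connection queue is
	// additionally governed by net.ipv4.tcp_max_syn_backlog
	if (listen(fd, 128) < 0)
		perror("listen");

	// from here the kernel completes handshakes in the background;
	// accept(2) later just dequeues already-established connections
	return 0;
}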

2: At which step of the three-way handshake does accept run?

Likewise, use man accept in a Linux environment to learn about this function. (Only part of the page is pasted here; the related error return values can be looked up on your own.)

       #include <sys/types.h> /* See NOTES */
       #include <sys/socket.h>

       int accept(int sockfd, struct sockaddr *addr, socklen_t *addrlen);

       #define _GNU_SOURCE /* See feature_test_macros(7) */
       #include <sys/socket.h>

       int accept4(int sockfd, struct sockaddr *addr, socklen_t *addrlen, int flags);

// The English description, translated back and reformatted:
/***************************************************
 DESCRIPTION
 accept() is used with connection-based socket types (SOCK_STREAM,
 SOCK_SEQPACKET). It extracts the first connection request from the queue of
 pending connections for the listening socket sockfd, creates a new connected
 socket, and returns a new file descriptor referring to that socket.
 ====> The newly created socket is not in the listening state; the original
 socket sockfd is unaffected by this call.

 The sockfd argument is a socket created with socket(2), bound to a local
 address with bind(2), and listening for connections after listen(2).

 The addr argument is a pointer to a sockaddr structure.
 ====> It is filled in with the address of the peer socket, as known to the
 communications layer; the exact format of the returned addr is determined by
 the socket's address family (see socket(2) and the respective protocol man
 pages).
 ====> When addr is NULL, nothing is filled in; in that case addrlen is not
 used and should also be NULL.

 The addrlen argument is a value-result argument: the caller must initialize
 it to the size (in bytes) of the structure pointed to by addr; on return it
 contains the actual size of the peer address.
 ====> If the buffer provided is too small, the returned address is truncated;
 in that case addrlen returns a value greater than the one supplied to the
 call.

 If no pending connections are present on the queue and the socket is not
 marked as non-blocking, accept() blocks the caller until a connection is
 present. If the socket is marked non-blocking and no pending connections are
 on the queue, accept() fails with the error EAGAIN or EWOULDBLOCK.

 To be notified of incoming connections on the socket, you can use select(2)
 or poll(2): a readable event is delivered when a new connection is attempted,
 and you can then call accept() to get a socket for that connection.
 Alternatively, you can set the socket to deliver SIGIO when activity occurs
 on it; see socket(7).

 For protocols that require explicit confirmation, such as DECnet, accept()
 can be thought of as merely dequeuing the next connection request, not
 implying confirmation; confirmation can be implied by a normal read or write
 on the new file descriptor, and rejection by closing the new socket.
 Currently only DECnet has these semantics on Linux.

 If flags is 0, accept4() is the same as accept(). The following values can be
 bitwise OR-ed in flags to obtain different behavior:
 SOCK_NONBLOCK  sets the O_NONBLOCK file status flag on the new open file
                description, saving a call to fcntl(2);
 SOCK_CLOEXEC   sets the close-on-exec (FD_CLOEXEC) flag on the new file
                descriptor (see the description of O_CLOEXEC in open(2) for
                why this can be useful).

 RETURN VALUE
 On success, these system calls return a non-negative integer, the descriptor
 of the accepted socket. On error, -1 is returned and errno is set
 appropriately.

 ERROR HANDLING
 Linux accept() (and accept4()) passes network errors already pending on the
 new socket as an error code from accept(). This behavior differs from other
 BSD socket implementations.
 For reliable operation, the application should detect the network errors
 defined for the protocol after accept() and treat them like EAGAIN by
 retrying. For TCP/IP these are ENETDOWN, EPROTO, ENOPROTOOPT, EHOSTDOWN,
 ENONET, EHOSTUNREACH, EOPNOTSUPP and ENETUNREACH.
*******************************/

My understanding:

===》After listen executes, the kernel maintains a half-connection queue and a full-connection queue for this fd (covering the three-way handshake while clients connect).

===》accept operates on the fd that has gone through bind and listen (internally the kernel maintains the half-connection queue and full-connection queue for this server fd); it takes one established connection from the full-connection queue.

===》accept takes one entry from the full-connection queue (the queue of completed connections) and creates a new fd.

===》Through its parameters, accept can return the connecting client's IP and port information.

===》For a non-blocking fd, when there is no pending connection, accept returns -1 with errno set to EAGAIN or EWOULDBLOCK.

===》accept can be used together with IO multiplexing (select, poll, epoll).

===》accept4 can set fd options directly in the same call; functionally it is the same as accept, just an extended version. (A small sketch comparing the two follows.)
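A small sketch of the two equivalent ways to get a non-blocking connected fd (assuming listenfd has already gone through bind and listen; accept4 is Linux-specific):

#define _GNU_SOURCE /* for accept4 */
#include <fcntl.h>
#include <sys/socket.h>

// variant 1: plain accept, then switch the new fd to non-blocking with fcntl
int accept_nonblock(int listenfd)
{
	int clifd = accept(listenfd, NULL, NULL); // NULL addr: peer info not needed here
	if (clifd >= 0)
		fcntl(clifd, F_SETFL, fcntl(clifd, F_GETFL, 0) | O_NONBLOCK);
	return clifd;
}

// variant 2: accept4 sets the flags in the same call, saving the fcntl round trips
int accept_nonblock4(int listenfd)
{
	return accept4(listenfd, NULL, NULL, SOCK_NONBLOCK | SOCK_CLOEXEC);
}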

3: What are the differences between TCP and UDP?

3.1: TCP is connection-oriented and must connect one-to-one (three-way handshake, four-way close); UDP is connectionless, transmits directly, and can be one-to-many.

3.2: TCP achieves reliable delivery of data through its "ACK + retransmission" mechanism, which necessarily affects speed; UDP just sends, which can easily cause network congestion.

===》How does TCP control reliable delivery and the transmission rate? (Through network congestion control (slow start, congestion avoidance, fast retransmit, fast recovery) and flow control (the sliding windows: receive window, congestion window, send window, working together with the slow-start threshold). A toy sketch follows this list.)

======》Slow start and congestion avoidance use the slow-start threshold to control the size of the send window.

======》Fast retransmit: after receiving three duplicate ACKs in a row, retransmit immediately.

======》Fast recovery: set the congestion window (and send window) to half of the original value, then continue under the slow start / congestion avoidance logic.

======》Besides this, TCP has several timers: the connection timer, the retransmission timer (RTO), the persist timer, the delayed ACK timer, the keep-alive timer, the FIN_WAIT_2 timer, and the TIME_WAIT timer.
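To make the window logic concrete, a toy Reno-style simulation (deliberately simplified: units are MSS segments, one step per RTT, and the loss event is scripted):

#include <stdio.h>

// Toy model of slow start / congestion avoidance / fast recovery (Reno-like).
// cwnd and ssthresh are in units of MSS; one loop iteration is one RTT.
int main(void)
{
	int cwnd = 1, ssthresh = 16;

	for (int rtt = 0; rtt < 20; rtt++) {
		int triple_dup_ack = (rtt == 8); // pretend a loss is detected at RTT 8

		if (triple_dup_ack) {
			ssthresh = cwnd / 2; // fast retransmit + fast recovery:
			cwnd = ssthresh;     // halve, and skip slow start
		} else if (cwnd < ssthresh) {
			cwnd *= 2;           // slow start: exponential growth per RTT
		} else {
			cwnd += 1;           // congestion avoidance: linear growth per RTT
		}
		printf("rtt=%2d cwnd=%2d ssthresh=%2d\n", rtt, cwnd, ssthresh);
	}
	return 0;
}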

3.3: TCP is a byte-stream transmission control protocol; UDP is a message-oriented datagram protocol.

Byte-stream oriented: that is, the data received by the protocol stack all sits in the buffer; we fetch data from the buffer with recv without knowing where the boundaries are, so the upper-layer protocol has to control them.

===》TCP guarantees these packets arrive in order and reliably, streamed into the buffer.

Datagram oriented: with recvfrom, each call fetches data from the buffer one received message at a time; the underlying protocol stack handles the boundaries. If one recvfrom does not take a whole message, the next call starts from the next message, and the part not taken is lost.

===》UDP does not guarantee data reliability: packets may be lost or arrive out of order. (A framing sketch for the TCP side follows.)
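Because TCP hands the application a boundary-less byte stream, the upper layer usually defines its own framing. A common convention (not part of TCP itself) is a length prefix; a minimal sketch of the sending side:

#include <arpa/inet.h> // htonl
#include <stdint.h>
#include <sys/socket.h>
#include <sys/types.h>

// Send one application "message" over the TCP byte stream by prefixing its
// length; the receiver first reads 4 bytes, then exactly len payload bytes.
int send_framed(int fd, const void *payload, uint32_t len)
{
	uint32_t be_len = htonl(len); // length prefix in network byte order
	if (send(fd, &be_len, sizeof(be_len), 0) != (ssize_t)sizeof(be_len))
		return -1;
	// note: a careful implementation loops here, since send may write less
	return send(fd, payload, len, 0) == (ssize_t)len ? 0 : -1;
}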

3.4: TCP guarantees reliability (sacrificing speed); UDP guarantees speed (but is unreliable).

TCP suits scenarios with lower transmission efficiency but high accuracy requirements, such as the World Wide Web (HTTP), file transfer (FTP), e-mail (SMTP), and so on.

UDP suits scenarios needing high transmission efficiency but tolerating lower accuracy, such as domain name resolution (DNS) (the packets are small, so UDP has little impact on the network), remote file servers (NFS), and so on.

A thought to leave for later: the TCP receive buffer is easy to understand (normally this phenomenon should not occur!), but TCP must also have a send buffer; if it caches a lot of data, how does TCP distinguish the boundaries of those cached packets? Or does it just treat them as a stream?

4: Why large numbers of CLOSE_WAIT appear

4.1: Understanding the cause of CLOSE_WAIT from the four-way close

Generally, tearing down a TCP connection is initiated by the client.

As can be seen from the figure below, the CLOSE_WAIT state arises because, after the client actively disconnects and the server replies to it, the server delays actively calling close to initiate its own disconnect.
[Figure: the TCP four-way close, showing where CLOSE_WAIT appears]

4.2: Code-level analysis of CLOSE_WAIT

A knowledge point here: when we implement TCP server code using IO multiplexing, the following holds:

===》When the client calls close, the server's recv is triggered with a received length of 0;

===》When reading data in a loop in ET mode, the final read returns -1 with the error code errno set to EAGAIN.

Analysis: when recv returns 0, the client has initiated a shutdown request; on the server we should close the connection as soon as possible, since performing close triggers the server's active-close sequence. (A minimal sketch follows.)

===》If nothing performs the close, or the close is held up, a large number of CLOSE_WAIT states will inevitably appear.
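The fix is mechanical; a minimal sketch of the readable-event branch (names and buffer handling are simplified):

#include <sys/epoll.h>
#include <sys/socket.h>
#include <unistd.h>

// called from the event loop when connfd is readable; returns bytes read,
// 0 if the peer closed (and we closed our side too), -1 on error/EAGAIN
static ssize_t handle_readable(int epfd, int connfd, char *buf, size_t len)
{
	ssize_t n = recv(connfd, buf, len, 0);
	if (n == 0) {
		// the peer sent FIN: close promptly, or this socket sits in CLOSE_WAIT
		epoll_ctl(epfd, EPOLL_CTL_DEL, connfd, NULL); // stop watching it
		close(connfd); // CLOSE_WAIT -> LAST_ACK -> CLOSED
	}
	return n;
}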

5: Why the CLOSING state appears

First find a TCP state transition diagram and look at the CLOSING state:
[Figure: TCP state transition diagram]

As the diagram shows, the CLOSING state appears when a socket in FIN_WAIT_1 receives a FIN from the peer, instead of the normally expected ACK for the FIN it has already sent.

This scenario means both ends sent FIN requests at the same time, a simultaneous-close scene (we sent a FIN, have not received its ACK, but received the peer's FIN).

After receiving the peer's ACK, the socket becomes TIME_WAIT (to make sure our own ACK was received by the other end), until it finally becomes CLOSED.

6: Why EAGAIN appears (in ET mode, EAGAIN can be used to judge whether you have read everything)

// In a Linux environment, run man recv and read the analysis of the return value.
EAGAIN or EWOULDBLOCK
    The socket is marked non-blocking and the receive operation would block, or a receive timeout had been set and the timeout expired before data was received. POSIX.1 allows either error to be returned in this case and does not require these constants to have the same value, so a portable application should check for both possibilities.

Personal understanding:

===》In other words, for a non-blocking fd, by the time we actually try to receive, the data is gone / the wait has timed out (someone else already handled it).

Scenarios:

===》In epoll's LT mode, when multiple threads handle the same fd at the same time, EAGAIN will appear.
===》In epoll's ET mode, when an event fires, read data in a loop; when the read returns -1, this is the errno you get.

So it turns out that when TCP recv has no data, it returns -1 with errno set to EAGAIN;

a return value of 0 means the client disconnected. (The usual drain loop is sketched below.)
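Putting the two return values together, the usual ET-mode drain loop looks like this sketch (a non-blocking connfd is assumed):

#include <errno.h>
#include <sys/socket.h>

// ET mode: keep reading until recv reports EAGAIN (buffer drained);
// returns 1 when drained, 0 when the peer disconnected, -1 on a real error
static int drain_fd(int connfd)
{
	char buf[1024];
	for (;;) {
		ssize_t n = recv(connfd, buf, sizeof(buf), 0);
		if (n > 0)
			continue;              // got data; a real server would process buf here
		if (n == 0)
			return 0;              // peer closed the connection
		if (errno == EINTR)
			continue;              // interrupted by a signal: retry
		if (errno == EAGAIN || errno == EWOULDBLOCK)
			return 1;              // everything read; wait for the next event
		return -1;                 // genuine error
	}
}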

7: How TCP guarantees ordering

There is a sequence number field in the TCP header; through this sequence number plus the ACK mechanism, TCP guarantees reliable, in-order reception.

8: How is the epoll thundering herd solved?

8.1: What is the thundering herd

Reference for simulating the phenomenon: Summary of the "thundering herd" problem in Linux network programming - Rabbit_Dale - Blog Garden (cnblogs.com)

The thundering herd is a server-side concern.

===》A server usually has to consider large numbers of connections and high-concurrency scenarios, and needs a specifically designed network model (such as nginx's multi-process model).

===》In a multi-process or multi-threaded network model, if this is not handled properly, the thundering herd appears.

===》For example: a TCP server designed so that multiple processes handle accept, recv, and send, jointly managing one epoll instance; when an event fires, all the processes respond, but for the actual handling of the event only one process will successfully fetch the data with epoll_wait.

Use strace on a scenario where multiple processes listen on one epoll instance, to demonstrate the thundering herd:

// Example: as the TCP server, the main process creates ten child processes that all wait on one epoll instance.
// Note: I did not manage to reproduce this with multiple threads, because the threads of one process are in effect scheduled and executed in a certain order ===> epoll_wait picks the event up promptly, and the operating system itself has already dealt with the herd?? But as with multiple processes, adding a sleep() after epoll_wait in the threads should also make it visible.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#include <sys/socket.h>
#include <arpa/inet.h>

#include <sys/epoll.h>

#include <fcntl.h>
#include <unistd.h> 
#include <errno.h>

#include <unistd.h> //pid_t fork(void);
#include <sys/types.h>
#include <sys/wait.h> //wait

#define VPS_PORT 9999

void fork_test();
// thundering-herd test logic
int fork_test_shocking_herd();

int main()
{
	// first test process creation
	//fork_test(); // do not run both tests together
	// thundering-herd test logic
	fork_test_shocking_herd();
	return 0;
}

void fork_test()
{
	// try creating ten child processes
	pid_t pid;
	for(int i=0; i<10; i++)
	{
		pid = fork();
		if(pid == 0)
		{
			printf("%d : fork() success [%d][%d] \n",i, getpid(), getppid());
			break;
		}

		if(pid > 0)
		{
			wait(NULL); // this really belongs at the end; placed here, each child finishes before the next is forked
			printf("%d: fork() parent pid is[%d] \n", i, getppid());
		}

		if(pid < 0)
		{
			printf("%d: fork() error \n\n", i);
		}
	}
	printf("my pid success is[%d][%d] \n", getpid(), getppid());
	return;
}

int vps_init_socket();
int vps_create_epoll_and_addfd(int listenfd);
int vps_epoll_wait_do_cycle(int epfd, int listenfd);
int fork_test_shocking_herd()
{
	// create the listening fd, then the epoll instance
	int fd = vps_init_socket();
	if(fd < 0)
	{
		printf("create vps socket fd error. \n");
		return -1;
	}else
	{
		printf("create vps socket fd:%d success. \n",fd);
	}
	int epfd = -1;
	// create epoll, add the listening fd, and return the epoll fd
	epfd = vps_create_epoll_and_addfd(fd);
	if(epfd == -1)
	{
		printf("create epoll fd error.\n");
		close(fd);
		return -1;
	}
	printf("\n\n create epoll and listenfd success :[%d:%d] \n", epfd, fd);

	// create ten child processes that all wait on the same epoll instance
	pid_t pid;
	for(int i=0; i<10;i++)
	{
		pid = fork();
		if(pid == 0)
		{
			// the child handles epoll events with epoll_wait
			printf("%d : fork() success [%d][%d] \n",i, getpid(), getppid());
			vps_epoll_wait_do_cycle(epfd, fd);
			break;
		}

		if(pid < 0)
		{
			printf("fork() error \n\n");
		}
	}
	// the parent waits for the children to finish
	if(pid > 0)
	{
		int status;
		wait(&status);
		close(fd);
		close(epfd);
	}

	return 0;
}
// set fd to non-blocking; by default an fd is blocking
int SetNonblock(int fd) {
	int flags;
	flags = fcntl(fd, F_GETFL, 0);
	if (flags < 0)
		return flags;
	flags |= O_NONBLOCK;
	if (fcntl(fd, F_SETFL, flags) < 0)
		return -1;
	return 0;
}

// create the server socket; the IP and port are hard-coded here
int vps_init_socket()
{
	int fd = socket(AF_INET, SOCK_STREAM, 0);
	if(fd < 0)
	{
		printf("create socket error. \n");
		return -1;
	}
	// set fd non-blocking, and make the port reusable
	SetNonblock(fd);
	int optval = 1;
	setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &optval, sizeof(int));

	// fill in the address parameters for the fd
	struct sockaddr_in server_addr;
	memset(&server_addr, 0, sizeof(struct sockaddr_in));
	server_addr.sin_family = AF_INET;
	server_addr.sin_port = htons(VPS_PORT);
	server_addr.sin_addr.s_addr = htonl(INADDR_ANY);

	if(bind(fd, (struct sockaddr*)&server_addr, sizeof(struct sockaddr)) < 0)
	{
		printf("vps socket bind error \n");
		return -1;
	}

	// mark fd as a passive socket for accept, and set the listen queue size
	if(listen(fd , 20) < 0)
	{
		printf("vps socket listen error \n");
		return -1;
	}

	printf("create and set up socket success. start accept.. \n");
	return fd;
}

int vps_create_epoll_and_addfd(int listenfd)
{
	// create the epoll instance
	int epfd = -1;
	epfd = epoll_create(1); // the size argument is ignored but must be greater than 0
	if(epfd == -1)
	{
		printf("create vsp epoll error. \n");
		return -1;
	}
	// add the listening fd with epoll_ctl
	struct epoll_event event;
	event.data.fd = listenfd;
	event.events = EPOLLIN | EPOLLET;  // watch for incoming connections, in ET mode

	if(epoll_ctl(epfd, EPOLL_CTL_ADD, listenfd, &event) == -1)
	{
		printf("vps epoll add listenfd error. \n");
		close(epfd);
		return -1;
	}
	printf("vps epoll create success and add listenfd success.[%d] \n", epfd);
	return epfd;
}

// event fired: handle connection requests
int vps_accept_exec(int epfd, int listenfd)
{
	// a connection is pending: accept it, then register it with epoll_ctl for readable events
	struct sockaddr_in cliaddr;
	socklen_t clilen;

	// ET mode: drain all pending connections
	int clifd = -1;
	int success_time = 0;
	while(1)
	{
		clilen = sizeof(struct sockaddr_in); // accept() rewrites clilen, so reset it each time
		clifd = accept(listenfd, (struct sockaddr *)&cliaddr, &clilen);
		// accept returns a non-negative fd on success, -1 on error
		if(clifd == -1)
		{
			// EAGAIN/EWOULDBLOCK: the pending queue is drained, so stop here
			if (errno == EAGAIN || errno == EWOULDBLOCK)
				break;
			if (errno == EINTR) // interrupted by a signal: just retry
				continue;
			printf(" accept error: [%s]\n", strerror(errno));
			return -1;
		}
		// the connected fd should be added to epoll
		SetNonblock(clifd);
		struct epoll_event clifd_event;
		clifd_event.data.fd = clifd;
		clifd_event.events = EPOLLIN | EPOLLET; // ET mode: reads must loop
		if(epoll_ctl(epfd, EPOLL_CTL_ADD, clifd, &clifd_event) == -1)
		{
			printf("vps accetp epoll ctl error . \n");
			close(clifd);
			return -1;
		}
		success_time++;
		printf("accept success. [%d:%s:%d] connected \n",clifd, inet_ntoa(cliaddr.sin_addr), ntohs(cliaddr.sin_port));
	}
	if(success_time == 0)
	{
		// woken up but got no connection: this is the thundering herd
		printf("\n\n accept error [%d:%d] \n\n",getpid(), getppid());
	}
	return 0;
}

// event fired: handle a readable request and read the data (writes are not watched here)
int vps_recv_exec(int epfd, int connfd)
{
	// the real business logic goes here: receive data, then actively send back a reply
	// if there is data, receive until it is all read; then close the connection
	printf("start recv data from client [%d].",connfd);
	// the business scenario here is not a busy one; the client quits after each send?
	// try letting the client disconnect actively;
	// you could also implement a timer yourself to detect and actively disconnect
	char recv_data[1024] = {0};
	int datalen = -1;
	// a signal interruption can also make the received length -1
	while(1){
		// the condition must not include == 0; otherwise we would loop forever once the client disconnects
		while((datalen = read(connfd,  recv_data,  1024)) > 0 )
		{
			printf("recv from [%d] data len[%d], data[%s] \n", connfd, datalen, recv_data);
			memset(recv_data, 0, 1024);
		}

		// when the client shuts down / disconnects, the received length is 0
		printf("recv from [fd:%d] end \n", connfd);

		// reply to the received message; you could keep a mapping of fd to client ip/port here and build the reply from it
		const char * send_data = "hi i have recv your msg \n";
		if(strlen(send_data) ==  write(connfd, send_data, strlen(send_data)))
		{
			printf("send buff succes [len:%lu]%s", strlen(send_data), send_data);
		}

		// the server read 0 bytes because the client shut down: close the fd and remove it from epoll
		if(datalen == 0)
		{
			if(epoll_ctl(epfd, EPOLL_CTL_DEL, connfd, 0) == -1)
			{
				printf("vps [fd:%d] close ,remove from epoll event error\n", connfd);
			}else
			{
				printf("vps [fd:%d] close ,remove from epoll event success\n", connfd);
				close(connfd);
			}

			break;
		}

		// -1 usually just means everything has been read (EAGAIN)
		if(datalen == -1)
		{
			printf("recv end error: [%s]\n", strerror(errno)); // triggers every time once the data is drained
			if (errno == EINTR) // interrupted by a signal: keep reading
			{
				continue;
			}
			// EAGAIN / EWOULDBLOCK: nothing left to read. Should we remove this fd here?
			// treat removal as the short-connection (close after reply) case
			// if(epoll_ctl(epfd, EPOLL_CTL_DEL, connfd, 0) == -1)
			// {
			// printf("vps client [%d] remove from epoll error\n", connfd);
			// }else
			// {
			// printf("vps client [%d] remove from epoll success\n", connfd);
			// }

			// close(connfd);
			break;
		}
	}

	return 0;
}

// watch epfd with epoll_wait, then do the business processing
int vps_epoll_wait_do_cycle(int epfd, int listenfd)
{
	struct epoll_event event_wait[1024];
	int nready = 0;

	while(1) // with multiple threads, you would set a termination flag here
	{
		//int epoll_wait(int epfd, struct epoll_event *events, int maxevents, int timeout);
		nready = epoll_wait(epfd, event_wait, 1024, -1);
		sleep(2); // delay on purpose so the thundering herd becomes visible
		printf(" \n\n i have get the event [%d:%d] \n", getpid(),getppid());
		if(nready < 0)
		{
			if (errno == EINTR) // interrupted by a signal
			{
				printf("vps epoll_wait return and errno is EINTR \n");
				continue;
			}
			printf("vps epoll_wait error.[%s]\n", strerror(errno));
			break;
		}

		if(nready == 0)
		{
			continue;
		}

		// events have fired: do the business processing
		for(int i = 0; i<nready; i++)
		{
			// handle readable events, distinguishing the listening fd
			if(event_wait[i].events & EPOLLIN)
			{
				if(event_wait[i].data.fd == listenfd)
				{
					// handle accept; only reads are watched here, not writes
					vps_accept_exec(epfd, event_wait[i].data.fd);
				}else
				{
					// handle recv; the peer may have actively closed
					vps_recv_exec(epfd, event_wait[i].data.fd);
				}
			}

			// on error/hangup, remove the fd from epoll and close it;
			// if the business is not "client quits after sending", we only del on exceptions
			if (event_wait[i].events & (EPOLLERR | EPOLLHUP)) // EPOLLHUP: the peer is gone
			{
				printf("epoll error [EPOLLERR | EPOLLHUP].\n");
				epoll_ctl(epfd, EPOLL_CTL_DEL, event_wait[i].data.fd, NULL);
				close(event_wait[i].data.fd);
			}
		}
	}
	return 0;
}

The phenomenon:

When simulating the herd with multiple processes, at first no phenomenon could be seen: the current Linux version has already dealt with the thundering herd and wakes up only one process.

After adding a sleep(2) after epoll_wait, the thundering herd becomes visible...

strace ./xxx ===》As seen in the figure, ten processes were created, and one request actually wakes up several processes.

// Note: the herd appears because the event is not consumed in time; one event
// wakes multiple processes, but only one process is needed.
// The operating system already works around this, so the simulation here adds a sleep.
	while(1)
	{
		nready = epoll_wait(epfd, event_wait, 1024, -1);
		sleep(2);  // note
		printf(" \n\n i have get the event [%d:%d] \n", getpid(),getppid());
		...
	}

[Figure: strace output showing several processes woken by a single request]

8.2: How to deal with the thundering herd

The root cause of the thundering herd: multiple processes/threads handle the same event.

Solution direction: treat it as contention for a resource (make sure only one process/thread handles a given event).

8.2.1: Thinking through solutions

====》One process does the accept, and other processes handle the connected fds (recv and send); accept is given to one dedicated thread/coroutine/process while the others receive and send (the memcached approach).

========》Here the newly accepted fd is handed to a worker thread; the workload is not balanced, and the accept thread and the worker threads each maintain their own event management.

====》Or, like nginx's multi-process network model, use an accept lock: lock the read event so that only one process can handle it.

========》The accept mutex is actually a cross-process lock, living in memory shared by all the processes.

========》All the processes first contend for the mutex; the winning process accepts the connection, then reads, parses, and processes the request (one request is handled entirely by one process, and only in that process). The lock has to be released promptly so later events can be handled. (A sketch of such a lock follows.)
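As an illustration of the idea only (not nginx's actual code; in real nginx the lock decides whether a worker adds the listening fd to its event set), a minimal sketch of a cross-process mutex in shared memory guarding accept; all names here are mine:

#include <pthread.h>
#include <sys/mman.h>
#include <sys/socket.h>

// a cross-process "accept lock": a pthread mutex placed in shared memory,
// so that only the process holding it calls accept
static pthread_mutex_t *accept_lock;

void accept_lock_init(void)
{
	// shared anonymous mapping: visible to all children forked afterwards
	accept_lock = mmap(NULL, sizeof(pthread_mutex_t),
	                   PROT_READ | PROT_WRITE,
	                   MAP_SHARED | MAP_ANONYMOUS, -1, 0);
	pthread_mutexattr_t attr;
	pthread_mutexattr_init(&attr);
	pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED);
	pthread_mutex_init(accept_lock, &attr);
}

// in each worker's event loop:
int guarded_accept(int listenfd)
{
	int clifd = -1;
	if (pthread_mutex_trylock(accept_lock) == 0) // non-blocking, nginx-style trylock
	{
		clifd = accept(listenfd, NULL, NULL);
		pthread_mutex_unlock(accept_lock);       // release as soon as possible
	}
	return clifd; // -1: another worker holds the lock, or no pending connection
}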

8.2.2: The nginx accept lock

A caveat: I have not fully read this part of the nginx source; the understanding of the accept lock below is theory-based (after all, I did not study the code, so it may not be reliable).

=======》The accept lock exists because of the phenomenon that, without it, multiple processes all handle an incoming connection.

=======》nginx acts mostly as a web server, built around the HTTP one-request / one-reply scenario. For that kind of business, the accept lock is used to extract the events that have fired and do the new-connection and receive processing.

=======》Inside nginx, each worker process's load balancing is controlled by a variable: 7/8 of its total connections (the effective total number of connections depends on operating-system settings and configuration; it is performance- and memory-related).

=======》On top of load balancing, an nginx worker tries to grab the accept lock; after getting it, the worker performs new-connection handling and existing-event handling. To release the lock as soon as possible, a newly received connection is simply put into a queue and the lock is released (reading the source would help here... there are two queues, a connection queue and an events queue).

Note a detail here: how is the quick release of the accept lock guaranteed, and how does nginx implement that logic?

The benefits of nginx's multi-process model: core binding and decoupling (the master process and each worker process have their own duties); the worker processes do not affect each other, and within a single process no locking is needed.

9: Why do coroutines exist?

Reference: With multithreading available, why do we still need coroutines? - Zhihu (zhihu.com)

9.1: Process and thread scheduling overhead, no control over switching order, memory consumption

A process's memory space is independent; switching between processes costs a lot of CPU, and inter-process communication is troublesome. But processes are safe: if one crashes, the others are not affected.

Threads have their own independent stacks but can also access the whole address space of the process, and the kernel switches them quickly. Threads are not safe: one crashing thread can bring down the whole process.

====》Threads each get allocated memory, so the number of threads a process can create is limited.

Whether processes or threads, at the bottom they are scheduled and managed by the CPU (the kernel scheduler): there is scheduling overhead, and the switching order is controlled by the lower layer.

9.2: Coroutines: switching controlled in user mode

Frequent thread switching burns CPU; the switching can instead be managed at the user level. ===》saves CPU

Multithreading allocates fixed memory to each thread to keep it running; coroutines share the thread's memory and only schedule and manage the routines that run on it. ===》saves memory; memory no longer constrains us

Developing with coroutines gives asynchronous performance with synchronous-style programming, so we understand the business better ===》easier to follow the logic

Besides that, sharing memory across processes is hard, and threads have no protection from one another (one can crash the process); coroutines can avoid both. ==》stable

9.3: Coroutine implementation approaches and open-source libraries

9.3.1: The general idea of a coroutine library (purely from memory, to be sorted out properly later)

It is always said that a coroutine is a lightweight thread managed by the user layer. The principle of implementing coroutines is saving the registers, stack, parameters, and so on of the running routine, and then scheduling the next coroutine that is waiting.

(The details may not be exact; I will study a coroutine library properly later.)

====》The key to implementing coroutines is defining a structure that stores the registers, stack, parameters, and other running-context information.

====》Manage the context stacks and parameter information of multiple coroutines, and perform the scheduling switches (yield gives up the CPU, resume resumes execution).

====》The crux is the implementation logic of resume and yield. (A minimal ucontext sketch follows.)
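As a concrete illustration, a minimal yield/resume sketch built on glibc's ucontext component (one of the approaches listed in 9.3.2); the coro_* names are mine:

#include <stdio.h>
#include <ucontext.h>

// the coroutine's registers and stack pointer live in a ucontext_t,
// and swapcontext() performs the user-mode switch
static ucontext_t main_ctx, coro_ctx;
static char coro_stack[64 * 1024]; // the coroutine's private stack

static void coro_yield(void)  { swapcontext(&coro_ctx, &main_ctx); } // save coro, run main
static void coro_resume(void) { swapcontext(&main_ctx, &coro_ctx); } // save main, run coro

static void coro_body(void)
{
	printf("coro: step 1\n");
	coro_yield();                  // give the CPU back to main
	printf("coro: step 2\n");
}                                      // returning continues at uc_link (main_ctx)

int main(void)
{
	getcontext(&coro_ctx);         // initialize, then point it at its own stack
	coro_ctx.uc_stack.ss_sp = coro_stack;
	coro_ctx.uc_stack.ss_size = sizeof(coro_stack);
	coro_ctx.uc_link = &main_ctx;
	makecontext(&coro_ctx, coro_body, 0);

	coro_resume();                 // runs coro_body until its first yield
	printf("main: between resumes\n");
	coro_resume();                 // runs coro_body to the end
	printf("main: done\n");
	return 0;
}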

9.3.2: Coroutine implementation approaches and open-source libraries

Many languages already support coroutines, and for C/C++ there are a number of coroutine libraries.

The key point in implementing coroutines is the resume function and the yield function, which mainly involve switching register information. As far as I know, the following approaches exist:

====》1: based on setjmp/longjmp (the State Threads library, implemented in C; the SRS streaming-media server uses this coroutine library)

====》2: based on glibc's ucontext component (coroutine)

====》3: based on assembly (libco and ntyco are libraries based on assembly implementations)

==》And, as described online, using C's switch-case syntax (Protothreads); worth a look later... (A "featherweight" C-language library | CoolShell)

Long ago I worked on a project that used coroutines implemented in assembly, together with the C-language hook technique... this still needs to be sorted out.

I have also touched coroutine and libco, but that was when I first started working; worth reviewing.

10: What is the relationship between coroutines and network IO?

===》Not entirely clear to me; here is my understanding, helped along by some searching.

Coroutine switching is done separately at the user layer, so how is the switching order chosen? Save state + switch.

Network IO: we all know IO multiplexing can be used, handling network IO through event monitoring.

Coroutines: save state, and manage the switching of multiple routines.

Network IO: events trigger, and the corresponding business processing runs.

===》Combining the two: event-triggered network IO drives the scheduling and switching of coroutines; could that achieve high concurrency? (A sketch of the idea follows.)
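A sketch of the idea only, with the scheduler calls left as hypothetical declarations (none of these names come from a real library): a coroutine that would block on recv registers its fd and yields; the event loop resumes it when epoll reports the fd readable.

#include <errno.h>
#include <sys/socket.h>
#include <sys/types.h>

// hypothetical scheduler hooks (names are mine):
void schedule_wait_readable(int fd); // e.g. epoll_ctl(ADD) + map fd -> current coroutine
void coro_yield(void);               // switch back to the scheduler/event loop

// a "blocking-looking" recv that never blocks the thread: this is how
// coroutines plus event-driven network IO give synchronous-style code
ssize_t co_recv(int fd, void *buf, size_t len)
{
	for (;;) {
		ssize_t n = recv(fd, buf, len, 0);
		if (n >= 0)
			return n;                   // data, or 0 on peer close
		if (errno != EAGAIN && errno != EWOULDBLOCK)
			return -1;                  // real error
		schedule_wait_readable(fd);     // ask the event loop to watch this fd
		coro_yield();                   // resumed when the fd becomes readable
	}
}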

I have started trying to accumulate some common code; the code involved in this article is kept as a spare in my own code repository.


Copyright notice: this article was written by [yun6853992]; please keep the original link when reposting, thanks.
https://yzsam.com/2022/02/202202221107404083.html