Troubleshooting Network Packet Loss on Tencent Cloud International ECS Instances

2022-06-25 00:23:00 【87cloud】

This article describes the main causes of packet loss in ECS network access, together with the corresponding troubleshooting steps and solutions. The following walkthrough from 87cloud covers how to resolve network access packet loss on Tencent Cloud International ECS instances:

Possible causes

Packet loss in ECS network access can have the following causes:

TCP packet loss caused by rate limiting
UDP packet loss caused by rate limiting
Packet loss caused by soft interrupts
UDP send buffer full
UDP receive buffer full
TCP accept queue full
TCP request overflow
Connection limit reached
iptables policy rules

Prerequisite

Log in to the instance before you start locating and handling the problem. For details, see Logging In to a Linux Instance and Logging In to a Windows Instance.

Fault handling

TCP packet loss caused by rate limiting

ECS instances come in multiple specifications, and each specification has different network performance. When the bandwidth or packet rate of an instance exceeds the limit for its specification, the platform applies rate limiting, which causes packet loss. Troubleshoot as follows:

Check the bandwidth and packet rate of the instance.
On a Linux instance, run sar -n DEV 2 to view bandwidth and packet rate. In the output, rxpck/s and txpck/s are the packets received and sent per second, and rxkB/s and txkB/s are the receive and transmit bandwidth.
Compare the measured bandwidth and packet rate with the figures in Instance Specifications to check whether the instance has reached the performance limit of its specification.

If yes, upgrade the instance specification or adjust your business volume.
If no, submit a ticket for further investigation.
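
Where sar is not installed, similar figures can be approximated from the kernel's interface counters. The following is a minimal sketch rather than part of the original procedure: the function names read_counters and rates are hypothetical, and it assumes the usual /proc/net/dev layout in which each interface line lists receive bytes and packets first and transmit bytes and packets as the ninth and tenth counters.

```shell
# Print "rx_bytes rx_packets tx_bytes tx_packets" for one interface
# from a /proc/net/dev style file (hypothetical helper).
read_counters() {
    local iface="$1" file="${2:-/proc/net/dev}"
    awk -v ifc="$iface" '
        { sub(/:/, " ") }                  # "eth0:123" becomes "eth0 123"
        $1 == ifc { print $2, $3, $10, $11 }
    ' "$file"
}

# Turn two samples taken `secs` seconds apart into per-second rates.
# Output order: rxpck/s txpck/s rxkB/s txkB/s (truncated to integers).
rates() {
    local before="$1" after="$2" secs="$3"
    awk -v b="$before" -v a="$after" -v s="$secs" 'BEGIN {
        split(b, x, " "); split(a, y, " ")
        printf "%d %d %d %d\n",
            (y[2] - x[2]) / s, (y[4] - x[4]) / s,
            (y[1] - x[1]) / s / 1024, (y[3] - x[3]) / s / 1024
    }'
}
```

For example, `b=$(read_counters eth0); sleep 2; a=$(read_counters eth0); rates "$b" "$a" 2` samples eth0 (an assumed interface name) over a 2-second window. The resulting numbers can then be compared with the instance specification in the same way as the sar output.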

UDP packet loss caused by rate limiting

Follow the steps in TCP packet loss caused by rate limiting to determine whether the packet loss is caused by the instance reaching the performance limit of its specification.

If yes, upgrade the instance specification or adjust your business volume.
If no, the loss may be caused by the additional rate limit that the platform applies to DNS requests. When the overall bandwidth or packet rate of the instance reaches the limit of its specification, DNS request rate limiting may also be triggered, which shows up as UDP packet loss. Submit a ticket for further investigation.

Packet loss caused by soft interrupts

When the count in the second column of /proc/net/softnet_stat increases, the packet loss is classified as soft interrupt packet loss. If your instance experiences soft interrupt packet loss, troubleshoot as follows:
Check whether RPS is enabled:

If RPS is enabled, packets are dropped when the kernel parameter net.core.netdev_max_backlog is too small, so increase it. For details of kernel parameters, see Introduction to Common Kernel Parameters of Linux Instances.
If RPS is not enabled, check whether soft interrupt load is concentrated on a single CPU core, which prevents data from being sent and received in time. If so, you can:
Enable RPS so that soft interrupts are distributed more evenly.
Check whether the business process causes an uneven distribution of soft interrupts.
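
The second-column check described above can be sketched as follows. This is an illustrative helper, not an official tool: the function name softnet_drops is hypothetical, and it assumes that each row of the file corresponds to one CPU in order and that the columns are hexadecimal, as in /proc/net/softnet_stat.

```shell
# Print "cpuN dropped=<decimal>" for each row of a softnet_stat style file.
# Column 2 is the hexadecimal count of packets dropped because the
# per-CPU backlog queue was full.
softnet_drops() {
    local file="${1:-/proc/net/softnet_stat}" cpu=0 processed dropped rest
    while read -r processed dropped rest; do
        # printf converts the "0x..." hexadecimal string to decimal.
        printf 'cpu%d dropped=%d\n' "$cpu" "0x$dropped"
        cpu=$((cpu + 1))
    done < "$file"
}
```

Running softnet_drops twice a few seconds apart shows whether the counter is still growing, and on which CPU. Growth on a single CPU points to the uneven soft interrupt distribution discussed above.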

UDP send buffer full

If packet loss on your instance is caused by an insufficient UDP send buffer, troubleshoot as follows:

Run ss -nump to check whether the UDP send buffer is full.
If it is, increase the kernel parameters net.core.wmem_max and net.core.wmem_default, then restart the UDP program for the change to take effect. For details of kernel parameters, see Introduction to Common Kernel Parameters of Linux Instances.
If packets are still being lost, run ss -nump again to check whether the send buffer has failed to grow as expected. In that case, check whether the business code sets SO_SNDBUF through setsockopt. If it does, modify the code to increase SO_SNDBUF.
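
Reading the ss -nump output by eye can be tedious, so the check can be sketched as a small filter. This is a hedged example: the function name send_buffer_usage is hypothetical, and it assumes the skmem:(r…,rb…,t…,tb…,…) layout printed by recent iproute2 versions, where t is the memory currently allocated for sending and tb is the send buffer size.

```shell
# Read `ss -nump` output on stdin. For every skmem:(...) entry print
# "<send memory in use> <send buffer size> <percent used>".
send_buffer_usage() {
    sed -n 's/.*skmem:(r[0-9]*,rb[0-9]*,t\([0-9]*\),tb\([0-9]*\),.*/\1 \2/p' |
        awk '$2 > 0 { printf "%d %d %d%%\n", $1, $2, $1 * 100 / $2 }'
}
```

Typical usage would be `ss -nump | send_buffer_usage`. A percentage that stays close to 100% suggests the send buffer is full, and a buffer size that does not change after the kernel parameters were raised suggests the application pins it with SO_SNDBUF.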

UDP receive buffer full

If packet loss on your instance is caused by an insufficient UDP receive buffer, handle it as follows:

Run ss -nump to check whether the UDP receive buffer is full.
If it is, increase the kernel parameters net.core.rmem_max and net.core.rmem_default, then restart the UDP program for the change to take effect. For details of kernel parameters, see Introduction to Common Kernel Parameters of Linux Instances.
If packets are still being lost, run ss -nump again to check whether the receive buffer has failed to grow as expected. In that case, check whether the business code sets SO_RCVBUF through setsockopt. If it does, modify the code to increase SO_RCVBUF.
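
As a complementary check that the original steps do not mention, Linux also keeps system-wide UDP counters in /proc/net/snmp, including RcvbufErrors (datagrams dropped because a receive buffer was full) and SndbufErrors. The sketch below reads one counter by its header name; the function name udp_counter is hypothetical.

```shell
# Print one Udp counter (for example RcvbufErrors) from a /proc/net/snmp
# style file. The first "Udp:" line holds the column names and the second
# holds the values, so the column is located by name rather than position.
udp_counter() {
    local name="$1" file="${2:-/proc/net/snmp}"
    awk -v want="$name" '
        $1 == "Udp:" && !seen { for (i = 2; i <= NF; i++) col[$i] = i; seen = 1; next }
        $1 == "Udp:" && (want in col) { print $(col[want]) }
    ' "$file"
}
```

Calling `udp_counter RcvbufErrors` before and after a test run shows whether receive-buffer drops are still occurring once net.core.rmem_max and net.core.rmem_default have been raised.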

TCP accept queue full

The length of the TCP accept queue (the queue of fully established connections) is the smaller of net.core.somaxconn and the backlog argument that the business process passes to listen(). If packets are lost on your instance because the TCP accept queue is full, handle it as follows:

Increase the kernel parameter net.core.somaxconn. For details of kernel parameters, see Introduction to Common Kernel Parameters of Linux Instances.
Check whether the business process passes a backlog argument. If it does, increase that value accordingly.
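
To see which listening sockets are affected, one option is to filter the output of ss -lnt. For sockets in the LISTEN state, the Recv-Q column shows the current accept queue length and the Send-Q column shows the queue limit. The function name full_accept_queues below is hypothetical, and the sketch assumes the default column order of ss (State, Recv-Q, Send-Q, Local Address:Port).

```shell
# Read `ss -lnt` output on stdin and print every listening socket whose
# accept queue has reached its limit, as "<address> queued=<n> limit=<n>".
full_accept_queues() {
    awk '$1 == "LISTEN" && ($2 + 0) >= ($3 + 0) {
        print $4, "queued=" $2, "limit=" $3
    }'
}
```

Typical usage would be `ss -lnt | full_accept_queues`. The reported limit is the effective queue length described above, so it also confirms whether a change to net.core.somaxconn or the backlog argument has taken effect after the process restarts.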

TCP request overflow

When TCP receives data while the socket is locked by the user process, the data is placed on the backlog queue. If that step fails, a TCP request overflow occurs and the packet is dropped. Assuming the business process itself is performing normally, you can troubleshoot at the system level as follows:

Check whether the business process sets the buffer size itself through setsockopt:

If it does and the value is not large enough, modify the business process to specify a larger value, or stop specifying the size through setsockopt.
Note
The value set through setsockopt is capped by the kernel parameters net.core.rmem_max and net.core.wmem_max. When you adjust the business process, you can adjust net.core.rmem_max and net.core.wmem_max at the same time. After the adjustment, restart the business process for the configuration to take effect.
If it does not, increase the kernel parameters net.ipv4.tcp_mem, net.ipv4.tcp_rmem and net.ipv4.tcp_wmem to raise the memory thresholds of TCP sockets.
For how to modify kernel parameters, see Introduction to Common Kernel Parameters of Linux Instances.

Connection limit reached

ECS instances come in multiple specifications, and each specification has a different connection limit. When the number of connections of an instance exceeds the limit for its specification, the platform applies rate limiting, which causes packet loss. Handle it as follows:

Note
The number of connections refers to the number of sessions that the host machine keeps for the ECS instance, including TCP, UDP and ICMP sessions. This value is larger than the number of network connections reported by the ss or netstat command on the ECS instance.

Check the number of connections of your instance and compare it with the figures in Instance Specifications to check whether the instance has reached the performance limit of its specification.

If yes, upgrade the instance specification or adjust your business volume.
If no, submit a ticket for further investigation.

iptables policy rules

Even if no specific iptables rules have been configured on the cloud server, the iptables policy setting may cause packets arriving at the ECS instance to be dropped. Handle it as follows:

Run the following command to view the iptables policy rules.
```
iptables -L | grep policy 
```

The default iptables policy is ACCEPT. If the policy of the INPUT chain is not ACCEPT, all packets sent to the server are dropped. For example, the following result indicates that all packets entering the ECS instance will be dropped.

Chain INPUT (policy DROP)
Chain FORWARD (policy ACCEPT)
Chain OUTPUT (policy ACCEPT)

Run the following command, adjusting the value after -P as needed.
```
iptables -P INPUT ACCEPT 
```

After the adjustment, run the first command again. It should return the following result:

Chain INPUT (policy ACCEPT)
Chain FORWARD (policy ACCEPT)
Chain OUTPUT (policy ACCEPT)
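
When the check needs to be repeated across many instances, the policy lines can be filtered automatically. This is a minimal sketch under the assumption that iptables -L prints one "Chain NAME (policy VALUE)" line per built-in chain, as in the output shown above; the function name non_accept_policies is hypothetical.

```shell
# Read `iptables -L` output on stdin and print every built-in chain whose
# default policy is not ACCEPT, as "<chain> <policy>".
non_accept_policies() {
    awk '$1 == "Chain" && $3 == "(policy" {
        policy = $4
        sub(/\)$/, "", policy)            # strip the closing parenthesis
        if (policy != "ACCEPT") print $2, policy
    }'
}
```

Typical usage would be `iptables -L | non_accept_policies`. Empty output means every built-in chain uses the ACCEPT policy.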

Copyright notice
This article was written by [87cloud]. Please include a link to the original when reposting. Thank you.
https://yzsam.com/2022/176/202206241946513192.html