Solutions for network access packet loss on Tencent Cloud International ECS

2022-06-25 00:23:00 87cloud

This article describes the main causes of packet loss during network access to an ECS instance, along with the corresponding troubleshooting steps and solutions. Follow along with 87cloud to learn how to resolve network access packet loss on Tencent Cloud International ECS:

Possible causes

Packet loss during ECS network access may have the following causes:

  • TCP packet loss caused by triggering the rate limit
  • UDP packet loss caused by triggering the rate limit
  • Packet loss caused by soft interrupts
  • UDP send buffer full
  • UDP receive buffer full
  • TCP full connection queue (accept queue) full
  • TCP request overflow
  • The number of connections has reached the upper limit
  • iptables policy rules

Prerequisite

Log in to the instance before locating and handling the problem. For details, see Log in to a Linux instance and Log in to a Windows instance.

Fault handling

TCP packet loss caused by triggering the rate limit

ECS instances come in multiple specifications, and each specification has different network performance. When the bandwidth or packet rate of an instance exceeds the limit for its specification, the platform applies rate limiting, which causes packet loss. Troubleshoot as follows:

  1. Check the bandwidth and packet rate of the instance.
    On a Linux instance, run the sar -n DEV 2 command to view bandwidth and packet rate. In the output, rxpck/s and txpck/s are the number of packets received and sent per second, and rxkB/s and txkB/s are the receive and send bandwidth.
  2. Compare the measured bandwidth and packet rate against the instance specifications to determine whether the instance has reached the performance limit of its specification.
  • If yes, upgrade the instance specification or reduce the business traffic.
  • If no, submit a ticket for further investigation.
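As a rough illustration of step 1, the same four figures that sar reports can be derived from two snapshots of /proc/net/dev. This is a minimal sketch, not part of the original procedure: the function names are hypothetical, and kB is taken as 1024 bytes here.

```python
def parse_net_dev(text):
    """Parse /proc/net/dev content into per-interface byte and packet counters."""
    counters = {}
    for line in text.splitlines():
        if ":" not in line:
            continue  # the two header lines contain no colon
        name, data = line.split(":", 1)
        fields = data.split()
        if len(fields) < 10 or not all(f.isdigit() for f in fields[:10]):
            continue
        counters[name.strip()] = {
            "rx_bytes": int(fields[0]),
            "rx_packets": int(fields[1]),
            "tx_bytes": int(fields[8]),    # transmit columns start at index 8
            "tx_packets": int(fields[9]),
        }
    return counters


def interface_rates(before, after, interval_s):
    """Compute sar-style rates from two parsed snapshots taken interval_s apart."""
    rates = {}
    for iface, new in after.items():
        old = before.get(iface)
        if old is None:
            continue  # interface appeared between snapshots
        rates[iface] = {
            "rxpck/s": (new["rx_packets"] - old["rx_packets"]) / interval_s,
            "txpck/s": (new["tx_packets"] - old["tx_packets"]) / interval_s,
            "rxkB/s": (new["rx_bytes"] - old["rx_bytes"]) / 1024 / interval_s,
            "txkB/s": (new["tx_bytes"] - old["tx_bytes"]) / 1024 / interval_s,
        }
    return rates
```

To use it on a real instance, read /proc/net/dev twice a couple of seconds apart, pass both texts through parse_net_dev, and compare the resulting rates with the limits published for your instance specification.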

UDP packet loss caused by triggering the rate limit

Follow the steps for TCP packet loss caused by triggering the rate limit to determine whether the packet loss is caused by the instance reaching the performance limit of its specification.

  • If yes, upgrade the instance specification or reduce the business traffic.
  • If no, the packet loss may be caused by the platform's additional frequency limit on DNS requests. When the overall bandwidth or packet rate of the instance reaches the performance limit of its specification, the DNS request rate limit may be triggered and UDP packets may be lost. Submit a ticket for further investigation.

Packet loss caused by soft interrupts

When the count in the second column of /proc/net/softnet_stat increases, the operating system treats it as “soft interrupt packet loss”. If your instance triggers soft interrupt packet loss, troubleshoot as follows:
Check whether RPS is enabled:

  • If it is enabled, packet loss occurs when the kernel parameter net.core.netdev_max_backlog is too small, so increase it. For details of kernel parameters, see Introduction to Common Kernel Parameters of Linux Instances.
  • If it is not enabled, check whether soft interrupt load is high on a single CPU core, which prevents data from being sent and received in time. If so, you can:
    • Enable RPS to distribute soft interrupts more evenly.
    • Check whether the business process causes soft interrupts to be distributed unevenly.
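The second-column check can be sketched as follows. The values in /proc/net/softnet_stat are hexadecimal, with one row per CPU. The function names are hypothetical, and treating the row index as the CPU index is a simplifying assumption.

```python
def parse_softnet_drops(text):
    """Return the second-column (dropped) counter of each row as an integer."""
    drops = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue
        drops.append(int(fields[1], 16))  # counters are printed in hex
    return drops


def new_drops(before, after):
    """Map row index -> increase in drops between two parsed snapshots."""
    return {
        cpu: later - earlier
        for cpu, (earlier, later) in enumerate(zip(before, after))
        if later > earlier
    }
```

Taking two snapshots a few seconds apart and calling new_drops on them shows which rows are still accumulating drops. A non-empty result suggests soft interrupt packet loss is ongoing rather than historical.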

UDP send buffer full

If packet loss on your instance is caused by an insufficient UDP send buffer, troubleshoot as follows:

  1. Run the ss -nump command to check whether the UDP send buffer is full.
  2. If it is, increase the kernel parameters net.core.wmem_max and net.core.wmem_default, and restart the UDP program for the change to take effect. For details of kernel parameters, see Introduction to Common Kernel Parameters of Linux Instances.
  3. If packets are still lost, run the ss -nump command to check whether the send buffer has failed to grow as expected. In that case, check whether the business code sets SO_SNDBUF through setsockopt. If it does, modify the code to increase SO_SNDBUF.
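When ss is run with the -m flag (included in -nump), it prints a skmem:(...) group for each socket. The sketch below parses that group, assuming the field letters documented in the ss man page (t for memory allocated for sending, tb for the send buffer limit, d for drops). Older ss versions may omit some fields, and the function names are hypothetical.

```python
import re

SKMEM_RE = re.compile(r"skmem:\(([^)]*)\)")
FIELD_RE = re.compile(r"([a-z]+)(\d+)")


def parse_skmem(line):
    """Extract the skmem fields from one line of ss output, or None if absent."""
    match = SKMEM_RE.search(line)
    if match is None:
        return None
    fields = {}
    for item in match.group(1).split(","):
        m = FIELD_RE.fullmatch(item.strip())
        if m:
            fields[m.group(1)] = int(m.group(2))
    return fields


def send_buffer_usage(fields):
    """Fraction of the send buffer limit (tb) currently allocated (t)."""
    if not fields or not fields.get("tb"):
        return None
    return fields.get("t", 0) / fields["tb"]
```

A usage value that stays close to 1.0 across repeated samples is one hint that the send buffer is the bottleneck. The same parser exposes rb and r for the receive side.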

UDP receive buffer full

If packet loss on your instance is caused by an insufficient UDP receive buffer, handle it as follows:

  1. Run the ss -nump command to check whether the UDP receive buffer is full.
  2. If it is, increase the kernel parameters net.core.rmem_max and net.core.rmem_default, and restart the UDP program for the change to take effect. For details of kernel parameters, see Introduction to Common Kernel Parameters of Linux Instances.
  3. If packets are still lost, run the ss -nump command to check whether the receive buffer has failed to grow as expected. In that case, check whether the business code sets SO_RCVBUF through setsockopt. If it does, modify the code to increase SO_RCVBUF.
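Step 3 refers to business code that sets the buffer size itself. A minimal sketch of what that looks like, and of how to confirm the size the kernel actually granted, is shown below. The function name and the requested size are illustrative only. On Linux the value read back is typically double the request and is capped by net.core.rmem_max.

```python
import socket


def open_udp_socket_with_rcvbuf(requested_bytes):
    """Create a UDP socket, request a receive buffer size, and report what was granted."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    # Ask the kernel for a larger receive buffer; the kernel may cap the request.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, requested_bytes)
    # Read back the effective size rather than trusting the requested value.
    effective = sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
    return sock, effective
```

If the effective value is far below what was requested, the kernel limit is the likely constraint and raising net.core.rmem_max comes first. The send side works the same way with SO_SNDBUF and net.core.wmem_max.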

TCP full connection queue full

The length of the TCP full connection queue (accept queue) is the smaller of net.core.somaxconn and the backlog parameter that the business process passes when calling listen. If packet loss on your instance is caused by a full TCP full connection queue, handle it as follows:

  1. Increase the kernel parameter net.core.somaxconn. For details of kernel parameters, see Introduction to Common Kernel Parameters of Linux Instances.
  2. Check whether the business process passes in the backlog parameter. If it does, increase it accordingly.
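The "smaller of the two" rule explains why raising only one value may have no effect. A small sketch with hypothetical function names, assuming the standard /proc path for the sysctl:

```python
def read_somaxconn(path="/proc/sys/net/core/somaxconn"):
    """Read the current net.core.somaxconn value from procfs (Linux only)."""
    with open(path) as handle:
        return int(handle.read().strip())


def effective_accept_queue(listen_backlog, somaxconn):
    """The queue length is the smaller of the listen() backlog and somaxconn."""
    return min(listen_backlog, somaxconn)
```

For example, a process that calls listen with a backlog of 1024 while net.core.somaxconn is 128 still gets a queue of 128, so both values need to be raised together.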

TCP request overflow

When TCP receives data while the socket is locked by a user process, the data is placed in the backlog queue. If this step fails, a TCP request overflow occurs and packets are dropped. Assuming the business process itself performs normally, you can troubleshoot at the system level as follows:

Check whether the business process sets the buffer size itself through setsockopt:

  • If it does and the value is not large enough, modify the business process to specify a larger value, or stop specifying the size through setsockopt.

    Note

    The value set through setsockopt is limited by the kernel parameters net.core.rmem_max and net.core.wmem_max. When adjusting the business process, you can adjust net.core.rmem_max and net.core.wmem_max at the same time. After the adjustment, restart the business process for the configuration to take effect.

  • If it does not, increase the kernel parameters net.ipv4.tcp_mem, net.ipv4.tcp_rmem, and net.ipv4.tcp_wmem to raise the TCP socket watermarks.
    To modify kernel parameters, see Introduction to Common Kernel Parameters of Linux Instances.
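The article does not name a counter for this condition. One place to look, offered here as an assumption, is the TCPBacklogDrop field in the TcpExt section of /proc/net/netstat. The sketch below parses that file's paired header and value lines; the function name is hypothetical.

```python
def parse_proc_netstat(text):
    """Parse /proc/net/netstat into {section: {counter_name: value}}."""
    lines = [line for line in text.splitlines() if line.strip()]
    stats = {}
    # The file alternates a header line of names with a line of values.
    for header, values in zip(lines[0::2], lines[1::2]):
        head_prefix, names = header.split(":", 1)
        value_prefix, numbers = values.split(":", 1)
        if head_prefix != value_prefix:
            continue  # not a matching header/value pair
        stats[head_prefix] = dict(
            zip(names.split(), (int(n) for n in numbers.split()))
        )
    return stats
```

Sampling the file twice and comparing stats["TcpExt"].get("TCPBacklogDrop") shows whether backlog drops are still increasing after the buffer parameters have been adjusted.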

The number of connections has reached the upper limit

ECS instances come in multiple specifications, and each specification has different connection count limits. When the number of connections of an instance exceeds the limit for its specification, the platform applies rate limiting, which causes packet loss. Handle it as follows:

Note

The number of connections refers to the number of sessions of the ECS instance that are saved on the host, including TCP, UDP, and ICMP. This value is greater than the number of network connections reported by the ss or netstat command on the ECS instance.

Check the number of connections of your instance and compare it against the instance specifications to determine whether the instance has reached the performance limit of its specification.

  • If yes, upgrade the instance specification or reduce the business traffic.
  • If no, submit a ticket for further investigation.

iptables policy rules

Even when no specific iptables rules are configured on the cloud server, the iptables policy settings may cause packets arriving at the ECS instance to be discarded. Handle it as follows:

  1. Run the following command to view the iptables policy rules.

    iptables -L | grep policy

    The default iptables policy is ACCEPT. If the policy of the INPUT chain is not ACCEPT, all packets sent to the server are discarded. For example, the following output indicates that all packets entering the ECS instance will be dropped:

    Chain INPUT (policy DROP)
    Chain FORWARD (policy ACCEPT)
    Chain OUTPUT (policy ACCEPT)

  2. Run the following command, adjusting the value after -P as needed.

    iptables -P INPUT ACCEPT

    After the adjustment, run the command from step 1 again. The following output should be returned:

    Chain INPUT (policy ACCEPT)
    Chain FORWARD (policy ACCEPT)
    Chain OUTPUT (policy ACCEPT)
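The policy check in step 1 can also be scripted. The following sketch scans the text printed by iptables -L for chains whose policy is anything other than ACCEPT. The function name is hypothetical, and the parsing assumes the "Chain NAME (policy X)" line format shown above.

```python
import re

POLICY_RE = re.compile(r"^Chain (\S+) \(policy (\w+)", re.MULTILINE)


def non_accept_policies(iptables_output):
    """Return {chain: policy} for every built-in chain whose policy is not ACCEPT."""
    return {
        chain: policy
        for chain, policy in POLICY_RE.findall(iptables_output)
        if policy != "ACCEPT"
    }
```

Applied to the first sample output above, the function reports the INPUT chain with a DROP policy. An empty result means all listed chains use ACCEPT.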

Copyright notice
This article was written by [87cloud]. Please include a link to the original when reposting.
https://yzsam.com/2022/176/202206241946513192.html