DPDK packet RX/TX problem case: locating an RX stall triggered by mismatched RX and TX functions
2022-07-25 15:54:00 【longyu_ wlz】
Problem symptom
The business process uses an X710 NIC. After receiving and sending more than ten thousand jumbo-frame packets, it can no longer receive packets. Checking the NIC RX/TX statistics shows the imissed field increasing continuously; the problem reproduces every time.
Environment information
- DPDK version: dpdk-16.04
- NIC PCI information:

24:00.0 Ethernet controller: Intel Corporation Ethernet Controller X710 for 10GbE SFP+ (rev 02)
24:00.1 Ethernet controller: Intel Corporation Ethernet Controller X710 for 10GbE SFP+ (rev 02)

- Interface statistics:
First snapshot:

rx_bytes: 124163766000
rx_packets: 13772588
rx_no_buffer_count: 0
rx_errors: 0
rx_missed_errors: 13755556
tx_bytes: 33291648
tx_packets: 16224
tx_broadcast: 0
tx_multicast: 0
tx_dropped: 0
tx_errors: 0

Second snapshot:

rx_bytes: 124176852000
rx_packets: 13774042
rx_no_buffer_count: 0
rx_errors: 0
rx_missed_errors: 13757010
tx_bytes: 33291648
tx_packets: 16224
tx_broadcast: 0
tx_multicast: 0
tx_dropped: 0
tx_errors: 0

From this information: tx_packets is not growing, while rx_packets and rx_missed_errors grow in lockstep. rx_no_buffer_count staying at 0 indicates there is no mbuf leak. The traffic on the link is confirmed to be very light, and rx_packets minus rx_missed_errors is only about 17,000 (13774042 - 13757010 = 17032), so the conclusion is that the program has stopped receiving packets.

- mbuf_size of the RX mempool is 2048
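These counters can also be polled from inside a DPDK application. Below is a minimal sketch (not from the original debug session; the port id and the one-second interval are illustrative), using the stock rte_eth_stats_get API, where imissed is the counter shown above as rx_missed_errors:

#include <inttypes.h>
#include <stdio.h>
#include <unistd.h>

#include <rte_ethdev.h>

/* Poll the basic port counters twice and print the deltas. A growing
 * imissed (rx_missed_errors) with a flat ipackets delta means the NIC
 * keeps receiving but the application is no longer draining the RX ring. */
static void check_rx_stall(uint8_t port_id)
{
    struct rte_eth_stats s1, s2;

    rte_eth_stats_get(port_id, &s1);
    sleep(1); /* illustrative sampling interval */
    rte_eth_stats_get(port_id, &s2);

    printf("ipackets delta: %" PRIu64 ", imissed delta: %" PRIu64 "\n",
           s2.ipackets - s1.ipackets, s2.imissed - s1.imissed);
}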
Debug process record
1. Check the interface RX/TX functions

RX function: i40e_recv_scattered_pkts

TX function: i40e_xmit_pkts_vec
2. Check the important fields in rte_eth_devices[port]->data
(gdb) print *rte_eth_devices[0]->data
$4 = {
name = "24:0.0", '\000' <repeats 25 times>,
rx_queues = 0x600025f12940,
tx_queues = 0x600025f128c0,
nb_rx_queues = 8,
nb_tx_queues = 8,
..................................
dev_private = 0x60000019d880,
dev_link = {
link_speed = 10000,
link_duplex = 1,
link_autoneg = 1,
link_status = 1
},
dev_conf = {
link_speeds = 0,
rxmode = {
mq_mode = ETH_MQ_RX_RSS,
max_rx_pkt_len = 9728,
split_hdr_size = 0,
header_split = 0,
hw_ip_checksum = 1,
hw_vlan_filter = 0,
hw_vlan_strip = 0,
hw_vlan_extend = 0,
jumbo_frame = 1,
hw_strip_crc = 0,
enable_scatter = 0,
enable_lro = 0,
enable_hash_offload = 0
},
txmode = {
mq_mode = ETH_MQ_TX_NONE,
pvid = 0,
hw_vlan_reject_tagged = 0 '\000',
hw_vlan_reject_untagged = 0 '\000',
hw_vlan_insert_pvid = 0 '\000'
},
.......................................
rx_mbuf_alloc_failed = 0,
.................................
port_id = 0 '\000',
promiscuous = 1 '\001',
scattered_rx = 1 '\001',
all_multicast = 0 '\000',
dev_started = 1 '\001',
The data structure above confirms the following:

- Promiscuous mode is enabled on the interface
- The interface is up and its link status is up
- The RX/TX queue configuration is normal
- In the NIC rxmode, hw_ip_checksum and jumbo_frame are enabled, and the maximum supported packet length max_rx_pkt_len is set to 9728
3. Check the important fields of the RX queue
(gdb) print *(struct i40e_rx_queue *)rte_eth_devices[0]->data->rx_queues[0]
$8 = {
mp = 0x60002d1f9ec0,
rx_ring = 0x60002624ee80,
rx_ring_phys_addr = 937750144,
sw_ring = 0x60002624dd40,
nb_rx_desc = 512,
rx_free_thresh = 32,
rx_tail = 412,
...............................
rx_free_trigger = 31,
...............................
rxrearm_nb = 0,
rxrearm_start = 0,
mbuf_initializer = 0,
port_id = 0 '\000',
crc_len = 4 '\004',
queue_id = 0,
reg_idx = 1,
drop_en = 0 '\000',
qrx_tail = 0x600064128004 "\230\001",
vsi = 0x60000011bb40,
rx_buf_len = 2048,
rx_hdr_len = 0,
max_pkt_len = 9728,
.........................
The rx_ring and sw_ring configurations are normal, with 512 descriptors, and no obvious anomaly shows up in the other fields. Note, however, that rx_buf_len is 2048 while max_pkt_len is 9728: with this configuration the NIC stores a jumbo frame across multiple mbufs, and the driver uses an xxx_recv_scattered_xxx style RX function.
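Note that rx_buf_len is not set directly by the application: the driver derives it from the RX mempool's data room at queue setup. A minimal sketch of that relationship (a simplified paraphrase of the driver logic; the real i40e code additionally aligns the value, which is elided here):

#include <rte_mbuf.h>

/* Per-mbuf receive buffer size the driver derives from an RX mempool:
 * data room minus headroom. With a data room of 2048 + headroom this
 * yields the rx_buf_len = 2048 seen in the queue dump above. */
static uint16_t rx_buf_len_for_pool(struct rte_mempool *mp)
{
    return (uint16_t)(rte_pktmbuf_data_room_size(mp) -
                      RTE_PKTMBUF_HEADROOM);
}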
Check the RX descriptor that rx_tail points to:
(gdb) print /x ((struct i40e_rx_queue *)rte_eth_devices[1]->data->rx_queues[0])->rx_ring[412]
$55 = {
read = {
pkt_addr = 0xe8e3e8e300000000,
hdr_addr = 0x2000600003019,
rsvd1 = 0x0,
rsvd2 = 0x0
},
wb = {
qword0 = {
lo_dword = {
mirr_fcoe = {
mirroring_status = 0x0,
fcoe_ctx_id = 0x0
},
l2tag1 = 0x0
},
hi_dword = {
rss = 0xe8e3e8e3,
fcoe_param = 0xe8e3e8e3,
fd_id = 0xe8e3e8e3
}
},
qword1 = {
status_error_len = 0x2000600003019
},
.....................................
}
}
In the output above, the pkt_addr value is clearly an abnormal address.
Questions raised

Have there been any relevant updates in higher driver versions?

No: the RX function code is exactly the same.
Doubts from the debug observations

The packet address in the next RX descriptor to be processed (pointed to by rx_tail) is abnormal. Analyzing the driver code shows this should never happen, which suggests the NIC may be misbehaving when it fills in descriptors. Could the problem be on the PCIe side?
Checking the faulty interface with lspci -nvv and comparing it against a normal interface reveals nothing unusual, so PCIe is tentatively ruled out.
Attempting recovery

The following tests were performed:

- Bringing the interface down and then up does not restore normal operation
- After restarting the engine, RX stalls again once tens of thousands of packets have been received and sent

If PCIe itself were faulty, restarting the engine and reinitializing the driver would not bring the interface back at all. These tests further rule out PCIe; the suspicion is that the NIC has been driven into some abnormal working state.
Narrowing down the problem with l2fwd as a control test

Since other factors might be interfering, and the problem reproduces reliably, I reasoned that if the fault is on the driver side, l2fwd should in theory reproduce it as well. With hw_ip_checksum and jumbo_frame enabled in rxmode and max_rx_pkt_len set to 9728, the following comparative experiments were carried out (a mempool sketch follows the list):
1. Create a mempool whose mbuf dataroom is 9728 + headroom: RX and TX work normally.
2. Create a mempool whose mbuf dataroom is 2048 + headroom: after a few hundred packets are received and sent, RX stalls.
3. On top of test 2, switch l2fwd to an RX-only mode that never transmits: packets are received continuously, showing that the problem lies on the TX side.
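For reference, here is a sketch of how the two test mempools could be created with the stock rte_pktmbuf_pool_create helper (the pool size and cache size are illustrative assumptions, not l2fwd's actual values):

#include <rte_lcore.h>
#include <rte_mbuf.h>

#define NB_MBUF 8192 /* illustrative pool size */

static struct rte_mempool *pool_big, *pool_std;

static void create_test_pools(void)
{
    /* Test 1: dataroom large enough for a whole 9728-byte jumbo frame,
     * so one mbuf can hold the packet (single-seg). */
    pool_big = rte_pktmbuf_pool_create("pool_9728", NB_MBUF, 256, 0,
                                       9728 + RTE_PKTMBUF_HEADROOM,
                                       rte_socket_id());

    /* Test 2: standard 2048-byte dataroom, so a 9728-byte frame must be
     * scattered across a chain of mbufs (multi-seg). */
    pool_std = rte_pktmbuf_pool_create("pool_2048", NB_MBUF, 256, 0,
                                       2048 + RTE_PKTMBUF_HEADROOM,
                                       rte_socket_id());
}

With the first pool a jumbo frame fits in a single mbuf; the second forces scattered RX and, crucially, multi-seg mbuf chains on the TX path when forwarding.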
These tests narrow the problem down to the dataroom size of the mbufs created in the mempool l2fwd transmits from. But how does this size affect the RX/TX process?
DPDK has two working modes for handling jumbo frames

- The jumbo frame is stored in the dataroom of a single mbuf. This is called single-seg mode: one mbuf carries the whole packet.
- The jumbo frame is split across multiple mbufs saved as a chain. This is called multi-seg mode: the packet is a linked list of several mbufs.

The xxx_recv_scattered_xxx style RX functions support multi-seg mode, and they must be paired with a TX function that also supports multi-seg; otherwise TX misbehaves.
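To make the multi-seg layout concrete, here is a minimal sketch that walks an mbuf chain such as one returned by a scattered RX function (only standard rte_mbuf fields are used):

#include <stdio.h>

#include <rte_mbuf.h>

/* Walk a possibly multi-seg packet: pkt_len covers the whole chain,
 * nb_segs counts the mbufs, and each segment's data_len covers only
 * the bytes stored in that mbuf. */
static void dump_segments(const struct rte_mbuf *m)
{
    printf("pkt_len=%u nb_segs=%u\n", m->pkt_len, (unsigned)m->nb_segs);
    for (const struct rte_mbuf *seg = m; seg != NULL; seg = seg->next)
        printf("  data_len=%u\n", seg->data_len);
}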
How does the NIC know which mode to use?

When the upper layer calls the driver's xxx_dev_start interface, each RX queue's rx_buf_len is written into a NIC register; the NIC compares this size against the MTU to decide which working mode to use for incoming packets.

When rx_buf_len is greater than or equal to the MTU, single-seg mode is used; when rx_buf_len is less than the MTU, multi-seg mode is used. The driver performs a similar check to select a matching RX function, but the TX function is chosen without reference to these conditions, which is what makes a mismatch possible.
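A sketch of the RX-side part of this decision (a paraphrase, not the literal i40e source, which performs the comparison during RX queue initialization):

#include <stdbool.h>
#include <stdint.h>

/* Paraphrase of the RX-side decision: if the largest allowed packet
 * cannot fit in one receive buffer, the port runs in scattered
 * (multi-seg) RX mode and a *_recv_scattered_* function is installed. */
static bool needs_scattered_rx(uint32_t max_pkt_len, uint16_t rx_buf_len)
{
    return max_pkt_len > rx_buf_len; /* 9728 > 2048 here => multi-seg */
}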
Which i40e TX function supports multi-seg mode?

At this point, recall the interface RX/TX functions from the debug record:

- RX function: i40e_recv_scattered_pkts
- TX function: i40e_xmit_pkts_vec
Combined with the queue information (rx_buf_len is 2048 and max_rx_pkt_len is 9728), packets are received in multi-seg mode. Reading the code confirms that only the i40e_xmit_pkts TX function supports multi-seg; i40e_xmit_pkts_vec does not. The two functions are mismatched: transmitting chained mbufs through the vector TX function corrupts TX, drives the NIC into an abnormal working state, and RX then stops.
Why, then, does l2fwd use the i40e_xmit_pkts_vec TX function?

Analyzing the code leads to the following conclusions (a sketch of the selection check follows the list):

- l2fwd passes NULL as tx_conf in its tx_queue_setup call
- rte_eth_tx_queue_setup checks tx_conf: if it is NULL, it fetches the default txconf via the driver's dev_infos_get interface, then calls the underlying driver's tx_queue_setup function
- the default txconf in the i40e driver sets txq_flags to ETH_TXQ_FLAGS_NOMULTSEGS | ETH_TXQ_FLAGS_NOOFFLOADS
- when selecting the TX function, the i40e driver sees ETH_TXQ_FLAGS_NOMULTSEGS | ETH_TXQ_FLAGS_NOOFFLOADS set in txq_flags and installs a TX function that does not support multi-seg
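The 16.04-era selection check boils down to roughly the following (a paraphrased sketch, not the literal driver source; the real code also requires a large enough tx_rs_thresh before enabling the simple/vector path):

#include <rte_ethdev.h>

/* Paraphrased i40e TX-path choice: when both "simple" flags are set in
 * txq_flags (as they are in the driver's default txconf), the driver
 * installs the simple/vector TX function, which cannot send chained
 * (multi-seg) mbufs. */
static int tx_path_supports_multiseg(uint32_t txq_flags)
{
    const uint32_t simple_flags =
        ETH_TXQ_FLAGS_NOMULTSEGS | ETH_TXQ_FLAGS_NOOFFLOADS;

    return (txq_flags & simple_flags) != simple_flags;
}

With l2fwd's default NULL tx_conf this check fails, so the vector function is installed; clearing txq_flags, as in the patch below, forces the full i40e_xmit_pkts path.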
So it should be enough to pass an explicit tx_conf to tx_queue_setup, instead of relying on the default, to make l2fwd use a multi-seg capable TX function. l2fwd was modified as follows:
Index: main.c
===================================================================
--- main.c
+++ main.c
@@ -114,10 +114,11 @@
.rxmode = {
.split_hdr_size = 0,
.header_split = 0, /**< Header Split disabled */
- .hw_ip_checksum = 0, /**< IP checksum offload disabled */
+ .hw_ip_checksum = 1, /**< IP checksum offload enabled */
.hw_vlan_filter = 0, /**< VLAN filtering disabled */
- .jumbo_frame = 0, /**< Jumbo Frame Support disabled */
+ .jumbo_frame = 1, /**< Jumbo Frame Support enabled */
.hw_strip_crc = 0, /**< CRC stripped by hardware */
+ .max_rx_pkt_len = 9728,
},
.txmode = {
.mq_mode = ETH_MQ_TX_NONE,
@@ -225,12 +226,6 @@
BURST_TX_DRAIN_US;
struct rte_eth_dev_tx_buffer *buffer;
prev_tsc = 0;
timer_tsc = 0;
@@ -668,9 +661,13 @@
/* init one TX queue on each port */
fflush(stdout);
+ struct rte_eth_txconf txconf;
+ memset(&txconf, 0x00, sizeof(txconf));
+ txconf.txq_flags = 0;
+
ret = rte_eth_tx_queue_setup(portid, 0, nb_txd,
rte_eth_dev_socket_id(portid),
- NULL);
+ &txconf);
if (ret < 0)
rte_exit(EXIT_FAILURE, "rte_eth_tx_queue_setup:err=%d, port=%u\n",
ret, (unsigned) portid);
After the modification, recompile and run l2fwd: RX and TX are now normal. perf shows the hot functions as follows:
54.11% l2fwd [.] l2fwd_launch_one_lcore
45.22% l2fwd [.] i40e_recv_scattered_pkts
.......................................................
0.02% l2fwd [.] i40e_xmit_pkts
The RX function in use is i40e_recv_scattered_pkts and the TX function is i40e_xmit_pkts; both support multi-seg, and the problem is gone. Examining the business code along the same lines revealed that its tx_queue_setup call passed a tx_conf with ETH_TXQ_FLAGS_NOMULTSEGS | ETH_TXQ_FLAGS_NOOFFLOADS configured. Removing that configuration for the X710 NIC and retesting resolved the problem.
Extension

Writing this up, the driver's behavior seems unreasonable: the default txq_flags should not be preset to such a restrictive value, and newer versions ought to change it. Searching the DPDK git log turns up the following commit, which removed the txq_flags configuration from tx_conf:
commit 7d1daae8c7d28ae5b2b7fe054d1dc507edb405a9
Author: Qi Zhang <[email protected]>
Date: Wed May 2 11:56:33 2018 +0800
Its commit message reads:
Since we move to new offload APIs, txq_flags is no long needed.
This patch remove the dependence on that.
This change was not made to fix the problem described in this article; it simply removes txq_flags as part of the move to the new offloads API. Since the new offloads field defaults to 0, the problem described here does not arise.
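Under the new API, an application that needs chained mbufs on TX requests it explicitly. A minimal sketch for releases around 18.05 (DEV_TX_OFFLOAD_MULTI_SEGS is the flag from that era; it was later renamed RTE_ETH_TX_OFFLOAD_MULTI_SEGS):

#include <rte_ethdev.h>

/* Sketch for post-txq_flags DPDK: multi-seg TX is an explicit offload
 * request in the port (or per-queue) configuration. */
static const struct rte_eth_conf port_conf = {
    .txmode = {
        .mq_mode  = ETH_MQ_TX_NONE,
        .offloads = DEV_TX_OFFLOAD_MULTI_SEGS,
    },
};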
Summary

Reviewing the case afterwards, I realized the problem could have been pinned down from the very first piece of information in the debug record; I overlooked it only because I did not understand the underlying mechanism. It brings to mind the saying about finding doubt where there seems to be none. Sometimes treating a piece of information as unremarkable is just a quick and easy conclusion with the problem hidden inside it, and it takes constant practice to get better at this.

If I had thoroughly understood every piece of information in the debug record, this class of problem would have been solved much faster. That, perhaps, is the way to become a domain expert.