* XL710 with i40e driver drops packets on RX even at small rates
@ 2016-08-22 12:06 UTC
From: Ilya Maximets
To: dev@dpdk.org, Helin Zhang, Jingjing Wu
Cc: Dyasly Sergey, Heetae Ahn

Hello, All.

I've run into a really bad situation with packet drops at small packet
rates (~45 Kpps) while using an XL710 NIC with the i40e DPDK driver.

The issue was found while testing a PHY-VM-PHY scenario with OVS and
confirmed in a PHY-PHY scenario with testpmd.

DPDK version 16.07 was used in all cases.
XL710 firmware-version: f5.0.40043 a1.5 n5.04 e2505

Test description (PHY-PHY):

* The following cmdline was used:

    # n_desc=2048
    # ./testpmd -c 0xf -n 2 --socket-mem=8192,0 -w 0000:05:00.0 -v \
         -- --burst=32 --txd=${n_desc} --rxd=${n_desc} \
         --rxq=1 --txq=1 --nb-cores=1 \
         --eth-peer=0,a0:00:00:00:00:00 --forward-mode=mac

* The DPDK-Pktgen application was used as the traffic generator.
  A single flow was generated.

Results:

* Packet size: 128B, rate: 90% of 10Gbps (~7.5 Mpps):

  On the generator's side:

    Total counts:
        Tx   : 759034368 packets
        Rx   : 759033239 packets
        Lost : 1129 packets

    Average rates:
        Tx   : 7590344 pps
        Rx   : 7590332 pps
        Lost : 11 pps

  All of these dropped packets show up as RX-dropped on testpmd's side:

    +++++++++++++++ Accumulated forward statistics for all ports+++++++++++++++
    RX-packets: 759033239      RX-dropped: 1129          RX-total: 759034368
    TX-packets: 759033239      TX-dropped: 0             TX-total: 759033239
    +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

  At the same time, a 10G NIC with the IXGBE driver works perfectly,
  without any packet drops, in the same scenario.

The situation is much worse in the PHY-VM-PHY scenario with OVS:

* The testpmd application was used inside the guest to forward incoming
  packets (almost the same cmdline as for PHY-PHY).

* For packet size 256 B at a rate of 1% of 10Gbps (~45 Kpps):

    Total counts:
        Tx   : 1358112 packets
        Rx   : 1357990 packets
        Lost : 122 packets

    Average rates:
        Tx   : 45270 pps
        Rx   : 45266 pps
        Lost : 4 pps

  All 122 dropped packets can be found in the rx_dropped counter:

    # ovs-vsctl get interface dpdk0 statistics:rx_dropped
    122

  And again, no issues with IXGBE in exactly the same scenario.

Results of my investigation:

* I found that all of these packets are 'imissed'. This means that the
  rx descriptor ring overflowed.

* I've modified the i40e driver to check the real number of free
  descriptors not yet filled by the NIC and found that the HW fills rx
  descriptors at an uneven rate. It looks like it fills them in huge
  batches.

* So, the root cause of the packet drops with the XL710 is the somewhat
  uneven rate at which the NIC fills the hw rx descriptors. This exhausts
  the rx descriptors and the hardware drops packets. The 10G IXGBE NIC
  works more smoothly, and its driver is able to refill the hw ring with
  rx descriptors in time.

* The issue becomes worse with OVS because of the much bigger latencies
  between 'rte_eth_rx_burst()' calls.

The easiest workaround for this problem is to increase the number of RX
descriptors. Increasing it to 4096 eliminates the packet drops but
decreases performance a lot:

    For the OVS PHY-VM-PHY scenario by 10%
    For the OVS PHY-PHY scenario by 20%
    For the testpmd PHY-PHY scenario by 17% (22.1 Mpps --> 18.2 Mpps for 64B packets)

As a result we have a trade-off between a zero drop rate at small packet
rates and higher maximum performance, which is very sad.

Using 16B descriptors doesn't really help with performance.
Upgrading the firmware from version 4.4 to 5.04 didn't help with the drops.

Any thoughts?
Can anyone reproduce this?

Best regards, Ilya Maximets.
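The two counters mentioned in the investigation above can be watched directly
from the polling loop with the 16.07-era ethdev API: rte_eth_stats_get()
exposes 'imissed' (packets the NIC dropped because the RX ring was full) and
'rx_nombuf' (mbuf allocation failures), while rte_eth_rx_queue_count() reports
how many descriptors the NIC has already filled. The sketch below only
illustrates that idea; port 0, queue 0 and the 2048-descriptor ring are
assumptions taken from the testpmd command line, and EAL plus port/queue
initialization are assumed to have been done already.

    /* Sketch only: poll RX and watch ring occupancy and hardware drop
     * counters.  Assumes EAL init and port/queue setup were done before. */
    #include <stdio.h>
    #include <inttypes.h>
    #include <rte_ethdev.h>
    #include <rte_mbuf.h>

    #define PORT_ID  0      /* assumed: the XL710 port under test */
    #define QUEUE_ID 0
    #define N_DESC   2048   /* assumed: --rxd=2048 as in the testpmd cmdline */
    #define BURST    32

    static void poll_and_check(void)
    {
        struct rte_mbuf *pkts[BURST];
        uint16_t i, nb = rte_eth_rx_burst(PORT_ID, QUEUE_ID, pkts, BURST);

        /* Descriptors already filled by the NIC but not yet consumed by us;
         * a value close to N_DESC means the ring is about to overflow.
         * This call scans the ring, so use it for debugging only. */
        int used = rte_eth_rx_queue_count(PORT_ID, QUEUE_ID);
        if (used > (int)(N_DESC - 4 * BURST))
            printf("RX ring nearly full: %d/%d descriptors used\n",
                   used, N_DESC);

        struct rte_eth_stats st;
        if (rte_eth_stats_get(PORT_ID, &st) == 0 && st.imissed != 0)
            printf("imissed=%" PRIu64 " (ring overflow), rx_nombuf=%" PRIu64 "\n",
                   st.imissed, st.rx_nombuf);

        for (i = 0; i < nb; i++)
            rte_pktmbuf_free(pkts[i]);  /* drop; a real app would forward */
    }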
* Re: XL710 with i40e driver drops packets on RX even at small rates
@ 2017-01-03 12:18 UTC
From: Martin Weiser
To: dev

Hello,

we are also seeing this issue on one of our test systems, while it does not
occur on other test systems with the same DPDK version (we tested 16.11 and
current master).

The system on which we can reproduce this issue also has an X552 ixgbe NIC,
which can forward the exact same traffic using the same testpmd parameters
without a problem. Even if we install an 82599ES ixgbe NIC in the same PCI
slot that the XL710 was in, the 82599ES can forward the traffic without any
drops.

As in the issue reported by Ilya, all packet drops occur on the testpmd side
and are accounted as 'imissed'. Increasing the number of rx descriptors only
helps a little at low packet rates.

Drops start occurring at pretty low packet rates, around 100000 packets per
second.

Any suggestions would be greatly appreciated.

Best regards,
Martin
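For reference, a minimal sketch of the descriptor-ring knob being discussed,
assuming port 0, a single queue pair, socket 0 and a caller-provided mempool:
the requested ring size is clamped to the limits the PMD advertises via
rte_eth_dev_info_get() before the queue is set up. TX queue setup and
rte_eth_dev_start() are omitted. In testpmd the same setting is simply --rxd.

    /* Sketch only: set up one RX queue with a larger descriptor ring,
     * clamped to the device's advertised limits. */
    #include <stdint.h>
    #include <string.h>
    #include <rte_ethdev.h>
    #include <rte_mempool.h>

    static int setup_big_rx_ring(uint8_t port, uint16_t wanted,
                                 struct rte_mempool *mp)
    {
        struct rte_eth_dev_info info;
        struct rte_eth_conf conf;
        uint16_t n_desc = wanted;           /* e.g. 4096 to avoid the drops */
        int ret;

        memset(&conf, 0, sizeof(conf));
        rte_eth_dev_info_get(port, &info);

        /* The PMD reports its min/max RX ring sizes here. */
        if (n_desc > info.rx_desc_lim.nb_max)
            n_desc = info.rx_desc_lim.nb_max;
        if (n_desc < info.rx_desc_lim.nb_min)
            n_desc = info.rx_desc_lim.nb_min;

        ret = rte_eth_dev_configure(port, 1 /* rxq */, 1 /* txq */, &conf);
        if (ret < 0)
            return ret;

        /* A bigger ring absorbs bursty descriptor writeback, at the cost of
         * peak forwarding performance (see the numbers in the thread). */
        return rte_eth_rx_queue_setup(port, 0, n_desc, 0 /* socket */,
                                      NULL /* default rxconf */, mp);
    }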
* Re: XL710 with i40e driver drops packets on RX even at small rates
@ 2017-01-04 12:33 UTC
From: Martin Weiser
To: dev, Ilya Maximets
Cc: Helin Zhang, Jingjing Wu

Hello,

I have performed some more thorough testing on 3 different machines to
illustrate the strange results with the XL710.
Please note that all 3 systems were able to forward the traffic of Test 1 and
Test 2 without packet loss when an 82599ES NIC was installed in the same PCI
slot as the XL710 in the tests below.

Here is the test setup and the test results:


## Test traffic

In all tests the t-rex traffic generator was used to generate traffic on an
XL710 card with the following parameters:

### Test 1

./t-rex-64 -f cap2/imix_1518.yaml -c 4 -d 60 -m 25 --flip

This resulted in a 60 second run with ~1.21 Gbps of traffic on each of the two
interfaces, with ~100000 packets per second on each interface.

### Test 2

./t-rex-64 -f cap2/imix_1518.yaml -c 4 -d 60 -m 100 --flip

This resulted in a 60 second run with ~4.85 Gbps of traffic on each of the two
interfaces, with ~400000 packets per second on each interface.

### Test 3

./t-rex-64 -f cap2/imix_1518.yaml -c 4 -d 60 -m 400 --flip

This resulted in a 60 second run with ~19.43 Gbps of traffic on each of the two
interfaces, with ~1600000 packets per second on each interface.


## DPDK

On all systems a vanilla DPDK v16.11 testpmd was used with the following
parameters (PCI IDs differed between systems):

./build/app/testpmd -l 1,2 -w 0000:06:00.0 -w 0000:06:00.1 -- -i


## System 1

* Board: Supermicro X10SDV-TP8F
* CPU:
    Architecture:          x86_64
    CPU op-mode(s):        32-bit, 64-bit
    Byte Order:            Little Endian
    CPU(s):                8
    On-line CPU(s) list:   0-7
    Thread(s) per core:    2
    Core(s) per socket:    4
    Socket(s):             1
    NUMA node(s):          1
    Vendor ID:             GenuineIntel
    CPU family:            6
    Model:                 86
    Model name:            Intel(R) Xeon(R) CPU D-1518 @ 2.20GHz
    Stepping:              3
    CPU MHz:               800.250
    CPU max MHz:           2200.0000
    CPU min MHz:           800.0000
    BogoMIPS:              4399.58
    Virtualization:        VT-x
    L1d cache:             32K
    L1i cache:             32K
    L2 cache:              256K
    L3 cache:              6144K
    NUMA node0 CPU(s):     0-7
    Flags:                 fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch epb intel_pt tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm cqm rdseed adx smap xsaveopt cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local dtherm arat pln pts
* Memory channels: 2
* Memory: 2 * 8192 MB DDR4 @ 2133 MHz
* NIC firmware: FW 5.0 API 1.5 NVM 05.00.04 eetrack 80002505
* i40e version: 1.4.25-k
* OS: Ubuntu 16.04.1 LTS
* Kernel: 4.4.0-57-generic
* Kernel parameters: isolcpus=1,2,3,5,6,7 default_hugepagesz=1G hugepagesz=1G hugepages=1

### Test 1

Mostly no packet loss. Sometimes ~10 packets missed out of ~600000 on each
interface when testpmd was not started in interactive mode.

### Test 2

100-300 packets out of ~24000000 missed on each interface.

### Test 3

4000-5000 packets out of ~96000000 missed on each interface.
## System 2

* Board: Supermicro X10SDV-7TP8F
* CPU:
    Architecture:          x86_64
    CPU op-mode(s):        32-bit, 64-bit
    Byte Order:            Little Endian
    CPU(s):                32
    On-line CPU(s) list:   0-31
    Thread(s) per core:    2
    Core(s) per socket:    16
    Socket(s):             1
    NUMA node(s):          1
    Vendor ID:             GenuineIntel
    CPU family:            6
    Model:                 86
    Model name:            06/56
    Stepping:              4
    CPU MHz:               1429.527
    CPU max MHz:           2300.0000
    CPU min MHz:           800.0000
    BogoMIPS:              3400.37
    Virtualization:        VT-x
    L1d cache:             32K
    L1i cache:             32K
    L2 cache:              256K
    L3 cache:              24576K
    NUMA node0 CPU(s):     0-31
    Flags:                 fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch epb intel_pt tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm cqm rdseed adx smap xsaveopt cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local dtherm ida arat pln pts
* Memory channels: 2
* Memory: 4 * 16384 MB DDR4 @ 2133 MHz
* NIC firmware: FW 5.0 API 1.5 NVM 05.00.04 eetrack 80002505
* i40e version: 1.4.25-k
* OS: Ubuntu 16.04.1 LTS
* Kernel: 4.4.0-57-generic
* Kernel parameters: isolcpus=1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31 default_hugepagesz=1G hugepagesz=1G hugepages=1

### Test 1

Mostly no packet loss out of ~600000 packets.

### Test 2

400000-500000 packets out of ~24000000 missed on each interface.

### Test 3

1200000-1400000 packets out of ~96000000 missed on each interface.


## System 3

* Board: Supermicro X9SRW-F
* CPU:
    Architecture:          x86_64
    CPU op-mode(s):        32-bit, 64-bit
    Byte Order:            Little Endian
    CPU(s):                12
    On-line CPU(s) list:   0-11
    Thread(s) per core:    2
    Core(s) per socket:    6
    Socket(s):             1
    NUMA node(s):          1
    Vendor ID:             GenuineIntel
    CPU family:            6
    Model:                 62
    Model name:            Intel(R) Xeon(R) CPU E5-1650 v2 @ 3.50GHz
    Stepping:              4
    CPU MHz:               1200.253
    CPU max MHz:           3900.0000
    CPU min MHz:           1200.0000
    BogoMIPS:              7000.29
    Virtualization:        VT-x
    L1d cache:             32K
    L1i cache:             32K
    L2 cache:              256K
    L3 cache:              12288K
    NUMA node0 CPU(s):     0-11
    Flags:                 fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm epb tpr_shadow vnmi flexpriority ept vpid fsgsbase smep erms xsaveopt dtherm arat pln pts
* Memory channels: 4
* Memory: 4 * 8192 MB DDR3 @ 1600 MHz
* NIC firmware: FW 5.0 API 1.5 NVM 05.00.04 eetrack 80002537
* i40e version: 1.4.25-k
* OS: Ubuntu 16.04.1 LTS
* Kernel: 4.4.0-57-generic
* Kernel parameters: default_hugepagesz=1G hugepagesz=1G hugepages=1 isolcpus=1-5,7-11

### Test 1

No packets lost.

### Test 2

No packets lost.

### Test 3

No packets lost.


Best regards,
Martin
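Beyond the aggregate imissed counter, the i40e PMD also exposes additional
hardware counters through the xstats API, which can help attribute
per-interface loss figures like the ones above to a specific counter. A small
sketch of dumping them is shown below, assuming the ports have been
configured and started as in the testpmd runs:

    /* Sketch only: print all non-zero extended stats for a port after a
     * test run, to see which hardware counter accounts for the loss. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <inttypes.h>
    #include <rte_ethdev.h>

    static void dump_nonzero_xstats(uint8_t port)
    {
        int n = rte_eth_xstats_get_names(port, NULL, 0);
        if (n <= 0)
            return;

        struct rte_eth_xstat_name *names = calloc(n, sizeof(*names));
        struct rte_eth_xstat *vals = calloc(n, sizeof(*vals));

        if (names != NULL && vals != NULL &&
            rte_eth_xstats_get_names(port, names, n) == n) {
            int got = rte_eth_xstats_get(port, vals, n);

            for (int i = 0; i < got; i++)
                if (vals[i].value != 0)
                    printf("port %u: %s = %" PRIu64 "\n", (unsigned)port,
                           names[vals[i].id].name, vals[i].value);
        }
        free(names);
        free(vals);
    }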
* Re: XL710 with i40e driver drops packets on RX even at small rates
@ 2017-01-06 9:17 UTC
From: Martin Weiser
To: dev, Ilya Maximets
Cc: Helin Zhang, Jingjing Wu

Hello,

just to let you know, we were finally able to resolve the issue.

It seems that the affected boards had a firmware issue with PCIe Gen3 x8.
When we forced the PCI slots to run at Gen2 x8, the issue disappeared for
Test 1 and Test 2. Test 3 still produced missed packets, but probably only
because of the reduced PCIe Gen2 x8 bandwidth.

We then found out that a BIOS/firmware update for these boards was issued by
Supermicro in November ... unfortunately there are no change notes whatsoever.
But lo and behold, this update seems to include a fix for exactly this issue,
since the XL710 now works as expected with PCIe Gen3 x8.

Best regards,
Martin
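Given that root cause, it is worth verifying which link speed and width a
slot actually negotiated (before and after such a BIOS update). The sketch
below reads the PCIe Link Status register straight from the device's config
space in sysfs, the same information lspci -vv reports as 'LnkSta'. The
default PCI address is an assumption matching the testpmd ports above, a
little-endian host is assumed, and reading past the first 64 bytes of config
space requires root.

    /* Sketch only: decode the negotiated PCIe link speed and width of a
     * device by walking the capability list in its sysfs config file. */
    #include <stdio.h>
    #include <stdint.h>

    static int read_cfg(FILE *f, long off, void *buf, size_t len)
    {
        return fseek(f, off, SEEK_SET) == 0 && fread(buf, 1, len, f) == len;
    }

    int main(int argc, char **argv)
    {
        /* Default BDF is an assumption; pass the real one as argv[1]. */
        const char *bdf = argc > 1 ? argv[1] : "0000:06:00.0";
        char path[128];
        uint8_t pos, cap[2];
        uint16_t lnksta;
        FILE *f;

        snprintf(path, sizeof(path), "/sys/bus/pci/devices/%s/config", bdf);
        f = fopen(path, "rb");
        if (f == NULL) {
            perror(path);
            return 1;
        }
        if (!read_cfg(f, 0x34, &pos, 1))        /* capability list head */
            return 1;
        while (pos != 0) {
            if (!read_cfg(f, pos, cap, 2))      /* cap[0]=ID, cap[1]=next */
                return 1;
            if (cap[0] == 0x10) {               /* PCI Express capability */
                if (!read_cfg(f, pos + 0x12, &lnksta, 2))  /* Link Status */
                    return 1;
                /* Bits 3:0 = speed (1=2.5, 2=5, 3=8 GT/s), 9:4 = width. */
                printf("%s: PCIe Gen%u x%u\n", bdf,
                       (unsigned)(lnksta & 0xf), (unsigned)((lnksta >> 4) & 0x3f));
                return 0;
            }
            pos = cap[1];
        }
        fprintf(stderr, "%s: no PCIe capability found\n", bdf);
        return 1;
    }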
* Re: XL710 with i40e driver drops packets on RX even at small rates
@ 2017-01-06 9:45 UTC
From: Zhang, Helin
To: Martin Weiser, dev@dpdk.org, Ilya Maximets
Cc: Wu, Jingjing

Very good to know that! Congratulations!

/Helin