From: Jesper Dangaard Brouer
Subject: Re: [PATCH v2 net-next 0/4] udp: receive path optimizations
Date: Thu, 8 Dec 2016 22:17:16 +0100
Message-ID: <20161208221716.25a05a9b@redhat.com>
References: <1481218739-27089-1-git-send-email-edumazet@google.com>
 <20161208214819.30138d12@redhat.com>
In-Reply-To: <20161208214819.30138d12@redhat.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
To: Eric Dumazet
Cc: brouer@redhat.com, "David S . Miller", netdev, Paolo Abeni, Eric Dumazet

On Thu, 8 Dec 2016 21:48:19 +0100
Jesper Dangaard Brouer wrote:

> On Thu, 8 Dec 2016 09:38:55 -0800
> Eric Dumazet wrote:
>
> > This patch series provides about 100 % performance increase under flood.
>
> Could you please explain a bit more about what kind of testing you are
> doing that can show a 100% performance improvement?
>
> I've tested this patchset and my tests show *huge* speed-ups, but
> reaping the performance benefit depends heavily on the setup and on
> enabling the right UDP socket settings, and most importantly on where
> the performance bottleneck is: ksoftirqd (producer) or udp_sink (consumer).
>
> Basic setup: Unload all netfilter modules, and enable ip_early_demux:
>   sysctl net/ipv4/ip_early_demux=1
>
> Test generator: pktgen sending UDP packets as a single flow, 50Gbit/s mlx5 NICs.
>  - Vary packet size between 64 and 1514 bytes.

Below, I've added the baseline tests.

Baseline test on net-next at commit c9fba3ed3a4

> Packet-size: 64
> $ sudo taskset -c 4 ./udp_sink --port 9 --count $((10**7))
>                                  ns/pkt   pps         cycles/pkt
> recvMmsg/32  run: 0 10000000     537.70   1859756.90  2155
> recvmsg      run: 0 10000000     510.84   1957541.83  2047
> read         run: 0 10000000     583.40   1714077.14  2338
> recvfrom     run: 0 10000000     600.09   1666411.49  2405

Packet-size: 64 (baseline)
$ sudo taskset -c 4 ./udp_sink --port 9 --count $((10**7))
                                 ns/pkt   pps         cycles/pkt
recvMmsg/32  run: 0 10000000     499.75   2001016.09  2003
recvmsg      run: 0 10000000     455.84   2193740.92  1827
read         run: 0 10000000     566.99   1763703.49  2272
recvfrom     run: 0 10000000     581.02   1721098.87  2328

> The ksoftirqd thread "costs" more than udp_sink, which sits partly idle,
> and the UDP queue does not get full enough.  Thus, the patchset does not
> have any effect here.
>
> Try increasing the pktgen packet size, as this increases the copy cost in
> udp_sink.  Now a queue can form, and the udp_sink CPU has almost no idle
> cycles.  The "read" and "recvfrom" tests did still see some idle cycles.
>
> Packet-size: 1514
> $ sudo taskset -c 4 ./udp_sink --port 9 --count $((10**7))
>                                  ns/pkt   pps         cycles/pkt
> recvMmsg/32  run: 0 10000000     435.88   2294204.11  1747
> recvmsg      run: 0 10000000     458.06   2183100.64  1835
> read         run: 0 10000000     520.34   1921826.18  2085
> recvfrom     run: 0 10000000     515.48   1939935.27  2066

Packet-size: 1514 (baseline)
$ sudo taskset -c 4 ./udp_sink --port 9 --count $((10**7))
                                 ns/pkt   pps         cycles/pkt
recvMmsg/32  run: 0 10000000     453.88   2203231.81  1819
recvmsg      run: 0 10000000     488.31   2047869.13  1957
read         run: 0 10000000     480.99   2079058.69  1927
recvfrom     run: 0 10000000     522.64   1913349.26  2094

> Next trick: connected UDP sockets.
>
> Using a connected UDP socket (combined with ip_early_demux) removes the
> FIB lookup from ksoftirqd, and shifts the tipping point for the better.
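(Side note for readers: below is a minimal sketch of what a "connected"
UDP receiver looks like.  This is NOT the actual udp_sink source, and the
peer address 198.18.0.1:9 is a made-up stand-in for the pktgen generator;
udp_sink's --connect option may set up the peer differently.  The point is
that bind() plus connect() pins the remote address/port, so ip_early_demux
can match the established socket on RX instead of ksoftirqd doing a FIB
lookup for every packet.)

/* Minimal sketch of a connected UDP receiver (not the real udp_sink).
 * After bind(), connect() fixes the remote address/port, so
 * ip_early_demux can find the established socket (and reuse its cached
 * dst) on receive, avoiding a per-packet FIB lookup in ksoftirqd.
 */
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdio.h>
#include <sys/socket.h>
#include <unistd.h>

int main(void)
{
    struct sockaddr_in local = { 0 }, remote = { 0 };
    char buf[2048];
    int fd;

    fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd < 0) {
        perror("socket");
        return 1;
    }

    local.sin_family = AF_INET;
    local.sin_addr.s_addr = htonl(INADDR_ANY);
    local.sin_port = htons(9);          /* --port 9 as in the tests above */
    if (bind(fd, (struct sockaddr *)&local, sizeof(local)) < 0) {
        perror("bind");
        return 1;
    }

    /* Hypothetical pktgen generator address; once connected, only
     * packets from this peer are delivered to the socket.
     */
    remote.sin_family = AF_INET;
    remote.sin_port = htons(9);
    inet_pton(AF_INET, "198.18.0.1", &remote.sin_addr);
    if (connect(fd, (struct sockaddr *)&remote, sizeof(remote)) < 0) {
        perror("connect");
        return 1;
    }

    for (;;) {
        ssize_t len = recv(fd, buf, sizeof(buf), 0);
        if (len < 0) {
            perror("recv");
            break;
        }
    }
    close(fd);
    return 0;
}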
> Packet-size: 64
> $ sudo taskset -c 4 ./udp_sink --port 9 --count $((10**7)) --connect
>                                  ns/pkt   pps         cycles/pkt
> recvMmsg/32  run: 0 10000000     391.18   2556361.62  1567
> recvmsg      run: 0 10000000     422.95   2364349.69  1695
> read         run: 0 10000000     425.29   2351338.10  1704
> recvfrom     run: 0 10000000     476.74   2097577.57  1910

Packet-size: 64 (baseline)
$ sudo taskset -c 4 ./udp_sink --port 9 --count $((10**7)) --connect
                                 ns/pkt   pps         cycles/pkt
recvMmsg/32  run: 0 10000000     438.55   2280255.77  1757
recvmsg      run: 0 10000000     496.73   2013156.99  1990
read         run: 0 10000000     412.17   2426170.58  1652
recvfrom     run: 0 10000000     471.77   2119662.99  1890

> Change/increase the packet size:
>
> Packet-size: 1514
> $ sudo taskset -c 4 ./udp_sink --port 9 --count $((10**7)) --connect
>                                  ns/pkt   pps         cycles/pkt
> recvMmsg/32  run: 0 10000000     457.56   2185481.94  1833
> recvmsg      run: 0 10000000     479.42   2085837.49  1921
> read         run: 0 10000000     398.05   2512233.13  1595
> recvfrom     run: 0 10000000     391.07   2557096.95  1567

Packet-size: 1514 (baseline)
$ sudo taskset -c 4 ./udp_sink --port 9 --count $((10**7)) --connect
                                 ns/pkt   pps         cycles/pkt
recvMmsg/32  run: 0 10000000     491.11   2036205.63  1968
recvmsg      run: 0 10000000     514.37   1944138.31  2061
read         run: 0 10000000     444.02   2252147.84  1779
recvfrom     run: 0 10000000     426.58   2344247.20  1709

> It is a bit strange that changing the packet size flipped which syscall
> is the fastest.
>
> It is also interesting to see where the ksoftirqd limit is:
>
> The result from "nstat", while using recvmsg, shows that ksoftirqd is
> handling 2.6 Mpps, and that the consumer/udp_sink is the bottleneck at
> 2 Mpps (the difference shows up as UdpRcvbufErrors).
>
> [skylake ~]$ nstat > /dev/null && sleep 1 && nstat
> #kernel
> IpInReceives                    2667577            0.0
> IpInDelivers                    2667577            0.0
> UdpInDatagrams                  2083580            0.0
> UdpInErrors                     583995             0.0
> UdpRcvbufErrors                 583995             0.0
> IpExtInOctets                   4001340000         0.0
> IpExtInNoECTPkts                2667559            0.0

(baseline, 1514 bytes, recvmsg)
$ nstat > /dev/null && sleep 1 && nstat
#kernel
IpInReceives                    2702424            0.0
IpInDelivers                    2702423            0.0
UdpInDatagrams                  1950184            0.0
UdpInErrors                     752239             0.0
UdpRcvbufErrors                 752239             0.0
IpExtInOctets                   4053642000         0.0
IpExtInNoECTPkts                2702428            0.0

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer