From mboxrd@z Thu Jan 1 00:00:00 1970
From: Eric Dumazet
Subject: Re: [PATCH net-next-2.6] net: speedup udp receive path
Date: Thu, 29 Apr 2010 14:45:08 +0200
Message-ID: <1272545108.2222.65.camel@edumazet-laptop>
To: Changli Gao
Cc: hadi@cyberus.ca, David Miller, therbert@google.com, shemminger@vyatta.com, netdev@vger.kernel.org, Eilon Greenstein, Brian Bloniarz

On Thursday, April 29, 2010 at 20:12 +0800, Changli Gao wrote:

> On Thu, Apr 29, 2010 at 7:35 PM, jamal wrote:
> >
> > Same here - even in my worst case scenario 88.5% of 750Kpps > 600Kpps.
> > Attached are the history results to make more sense of what I am saying:
> > we have net-next kernels from apr14, apr23, apr23 with Changli's change,
> > apr28, apr28 with your change. What you'll see is non-rps (blue) gets
> > better and rps (orange) gets better slowly, then by apr28 it is worse.
>
> Did the number of IPIs increase in the apr28 test? The final patch
> with Eric's change may introduce more IPIs. And I am wondering why
> 23rdcl-non-rps is better than before.
> Maybe it is the side effect of my patch: it enlarges netdev_max_backlog.

Changli, I wonder how you can cook "performance" patches without testing
them at all for real... This cannot be true.

When the cpu doing the device softirq is flooded, it handles 300 packets
per net_rx_action() round (netdev_budget), so it sends at most 6 IPIs per
300 packets, with or without my patch, and with or without your patch as
well. (At most, because if the remote cpus are flooded as well, they do
not napi_complete, so no IPI is needed at all.)

(My patch has an effect only under normal load, i.e. one packet received
in a while... up to 50,000 pps I would say.) It also has a nice effect on
non-RPS loads (the more typical load for the coming years).

If a second packet comes 3us after the first one, and before the 2nd CPU
has handled it, we _can_ afford an extra IPI. 750,000/50 = 15,000 IPIs
per second.

Even with 200,000 IPIs per second, 'perf top -C CPU_IPI_sender' shows
that sending an IPI is very cheap (maybe ~1% of cpu cycles).

# Samples: 32033467127
#
# Overhead  Command  Shared Object      Symbol
# ........  .......  .................  ......
#
    18.05%  init  [kernel.kallsyms]  [k] poll_idle
    10.91%  init  [kernel.kallsyms]  [k] bnx2x_rx_int
    10.42%  init  [kernel.kallsyms]  [k] eth_type_trans
     5.72%  init  [kernel.kallsyms]  [k] kmem_cache_alloc_node
     5.43%  init  [kernel.kallsyms]  [k] __memset
     5.20%  init  [kernel.kallsyms]  [k] get_rps_cpu
     4.82%  init  [kernel.kallsyms]  [k] __slab_alloc
     4.34%  init  [kernel.kallsyms]  [k] get_partial_node
     4.22%  init  [kernel.kallsyms]  [k] _raw_spin_lock
     3.41%  init  [kernel.kallsyms]  [k] __kmalloc_node_track_caller
     3.01%  init  [kernel.kallsyms]  [k] __alloc_skb
     2.22%  init  [kernel.kallsyms]  [k] enqueue_to_backlog
     2.10%  init  [kernel.kallsyms]  [k] vlan_gro_common
     1.34%  init  [kernel.kallsyms]  [k] swiotlb_map_page
     1.25%  init  [kernel.kallsyms]  [k] skb_put
     1.06%  init  [kernel.kallsyms]  [k] _raw_spin_lock_irqsave
     0.92%  init  [kernel.kallsyms]  [k] dev_gro_receive
     0.88%  init  [kernel.kallsyms]  [k] swiotlb_dma_mapping_error
     0.83%  init  [kernel.kallsyms]  [k] vlan_gro_receive
     0.83%  init  [kernel.kallsyms]  [k] __phys_addr
     0.83%  init  [kernel.kallsyms]  [k] __napi_complete
     0.83%  init  [kernel.kallsyms]  [k] default_send_IPI_mask_sequence_phys
     0.77%  init  [kernel.kallsyms]  [k] is_swiotlb_buffer
     0.76%  init  [kernel.kallsyms]  [k] __netdev_alloc_skb
     0.74%  init  [kernel.kallsyms]  [k] deactivate_slab
     0.73%  init  [kernel.kallsyms]  [k] netif_receive_skb
     0.72%  init  [kernel.kallsyms]  [k] unmap_single
     0.69%  init  [kernel.kallsyms]  [k] csd_lock
     0.63%  init  [kernel.kallsyms]  [k] bnx2x_poll
     0.61%  init  [kernel.kallsyms]  [k] bnx2x_msix_fp_int
     0.59%  init  [kernel.kallsyms]  [k] irq_entries_start
     0.59%  init  [kernel.kallsyms]  [k] swiotlb_sync_single
     0.54%  init  [kernel.kallsyms]  [k] get_slab
     0.46%  init  [kernel.kallsyms]  [k] napi_skb_finish