From: Eric Dumazet
Subject: Re: [PATCH v6] net: batch skb dequeueing from softnet input_pkt_queue
Date: Mon, 03 May 2010 00:08:24 +0200
Message-ID: <1272838104.2173.166.camel@edumazet-laptop>
In-Reply-To: <20100502215450.GC2673@gargoyle.fritz.box>
To: Andi Kleen
Cc: David Miller, hadi@cyberus.ca, xiaosuo@gmail.com, therbert@google.com, shemminger@vyatta.com, netdev@vger.kernel.org, lenb@kernel.org, arjan@infradead.org

On Sunday, 02 May 2010 at 23:54 +0200, Andi Kleen wrote:
> On Sun, May 02, 2010 at 11:45:55PM +0200, Eric Dumazet wrote:
> > Tests just prove the reverse.
>
> What do you mean?
>

A test I did this week with Jamal.

We first set an "ee" RPS mask, because all NIC interrupts were handled by
CPU0, and Jamal thought, like you, that not using cpu4 would give better
performance.

But using the "fe" mask gave me a bonus, from ~700,000 pps to ~800,000 pps.

CPU: E5450 @ 3.00GHz
Two quad-core CPUs in the machine, tg3 NIC.
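
For reference, the two masks can be decoded as below. This is only a
minimal sketch, assuming the masks are the usual hexadecimal CPU bitmaps
written to rps_cpus, where bit n selects CPU n. The helper name
cpus_in_mask is made up for illustration and is not part of any kernel
tool.

```python
def cpus_in_mask(mask_hex):
    """Return the CPU indices whose bits are set in a hex CPU mask.

    Assumption: bit n of the mask corresponds to CPU n.
    """
    mask = int(mask_hex, 16)
    return [cpu for cpu in range(mask.bit_length()) if (mask >> cpu) & 1]


if __name__ == "__main__":
    for mask_hex in ("ee", "fe"):
        print(mask_hex, cpus_in_mask(mask_hex))
```

Under that assumption, "ee" leaves out CPU0 and cpu4, while "fe" leaves
out only CPU0, so the second run additionally lets cpu4 process packets.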

With RPS, CPU0 does not do a lot of things: it just talks to the NIC, brings in
a few cache lines per packet, and dispatches it to a slave CPU.

> HT (especially Nehalem HT) is useful for a wide range of workloads.
> Just handling network interrupts for its thread sibling is not one of them.
>

That's the theory; in practice I see different results.

Of course, this might be related to the hash distribution being different
and more uniform. I should redo the test with many more flows.