From mboxrd@z Thu Jan 1 00:00:00 1970
From: Eric Dumazet
Subject: Re: [PATCH v6] net: batch skb dequeueing from softnet input_pkt_queue
Date: Thu, 29 Apr 2010 19:56:12 +0200
Message-ID: <1272563772.2222.301.camel@edumazet-laptop>
References: <1272010378-2955-1-git-send-email-xiaosuo@gmail.com> <1272014825.7895.7851.camel@edumazet-laptop> <1272060153.8918.8.camel@bigi> <1272118252.8918.13.camel@bigi> <1272290584.19143.43.camel@edumazet-laptop> <1272293707.19143.51.camel@edumazet-laptop> <20100429174056.GA8044@gargoyle.fritz.box>
In-Reply-To: <20100429174056.GA8044@gargoyle.fritz.box>
To: Andi Kleen
Cc: hadi@cyberus.ca, Changli Gao, "David S. Miller", Tom Herbert, Stephen Hemminger, netdev@vger.kernel.org, Andi Kleen

On Thursday 29 April 2010 at 19:42 +0200, Andi Kleen wrote:
> > Andi, what do you think of this one ?
> > Don't we have a function to send an IPI to an individual cpu instead ?
>
> That's what this function already does. You only set a single CPU
> in the target mask, right?
>
> IPIs are unfortunately always a bit slow. Nehalem-EX systems have X2APIC
> which is a bit faster for this, but that's not available in the lower
> end Nehalems. But even then it's not exactly fast.
>
> I don't think the IPI primitive can be optimized much. It's not a cheap
> operation.
>
> If it's a problem do it less often and batch IPIs.
>
> It's essentially the same problem as interrupt mitigation or NAPI
> are solving for NICs.
> I guess just need a suitable mitigation mechanism.
>
> Of course that would move more work to the sending CPU again, but
> perhaps there's no alternative. I guess you could make it cheaper by
> minimizing access to packet data.
>
> -Andi

Well, IPIs are already batched, and the rate is self-adapting.

After various changes, things seem to be going better; maybe there is
something related to cache line thrashing.

I 'solved' it by using idle=poll, but you might take a look at the
clockevents_notify (acpi_idle_enter_bm) abuse of a shared and highly
contended spinlock...

    23.52%  init  [kernel.kallsyms]  [k] _raw_spin_lock_irqsave
            |
            --- _raw_spin_lock_irqsave
               |
               |--94.74%-- clockevents_notify
               |          lapic_timer_state_broadcast
               |          acpi_idle_enter_bm
               |          cpuidle_idle_call
               |          cpu_idle
               |          start_secondary
               |
               |--4.10%-- tick_broadcast_oneshot_control
               |          tick_notify
               |          notifier_call_chain
               |          __raw_notifier_call_chain
               |          raw_notifier_call_chain
               |          clockevents_do_notify
               |          clockevents_notify
               |          lapic_timer_state_broadcast
               |          acpi_idle_enter_bm
               |          cpuidle_idle_call
               |          cpu_idle
               |          start_secondary
               |
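
To illustrate the remark that IPIs are already batched and that their rate adapts by itself, here is a minimal user-space C model. It is not kernel code, and the mail does not spell out the mechanism. The sketch assumes one possible scheme: the sender raises an IPI only when the target CPU's queue goes from empty to non-empty, so the number of IPIs per packet falls as load rises. The names backlog, enqueue_packet, drain_backlog and NR_CPUS are hypothetical and do not come from the patch under discussion.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_CPUS 4 /* hypothetical size of the model */

/* Hypothetical per-CPU backlog; a stand-in for a per-CPU input queue. */
struct backlog {
	size_t qlen;        /* packets waiting for the target CPU */
	unsigned long ipis; /* IPIs counted for this CPU so far */
};

static struct backlog backlogs[NR_CPUS];

/*
 * Queue one packet for @cpu. An IPI is counted only when the queue goes
 * from empty to non-empty: if packets are already pending, the target
 * has been kicked and will find the new packet when it drains the queue.
 * Returns true when an IPI was counted.
 */
static bool enqueue_packet(unsigned int cpu)
{
	assert(cpu < NR_CPUS);

	struct backlog *b = &backlogs[cpu];
	bool kick = (b->qlen == 0);

	b->qlen++;
	if (kick)
		b->ipis++;
	return kick;
}

/* The target CPU drains everything it finds; returns packets handled. */
static size_t drain_backlog(unsigned int cpu)
{
	assert(cpu < NR_CPUS);

	size_t n = backlogs[cpu].qlen;

	backlogs[cpu].qlen = 0;
	return n;
}
```

If three packets are queued for one CPU before it drains, only the first enqueue_packet() call counts an IPI; after drain_backlog() empties the queue, the next enqueue counts another. Under light load that means roughly one IPI per packet, under heavy load far fewer.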