From mboxrd@z Thu Jan 1 00:00:00 1970
From: Alex Bligh
Subject: Re: Scalability of interface creation and deletion
Date: Sun, 08 May 2011 13:18:55 +0100
Message-ID: <7B76F9D75FD26D716624004B@nimrod.local>
References: <891B02256A0667292521A4BF@Ximines.local>
 <1304770926.2821.1157.camel@edumazet-laptop>
 <0F4A638C2A523577CDBC295E@Ximines.local>
 <1304785589.3207.5.camel@edumazet-laptop>
 <178E8895FB84C07251538EF7@Ximines.local>
 <1304793174.3207.22.camel@edumazet-laptop>
 <1304793749.3207.26.camel@edumazet-laptop>
 <1304838742.3207.45.camel@edumazet-laptop>
Reply-To: Alex Bligh
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii; format=flowed
Content-Transfer-Encoding: 7bit
Cc: netdev@vger.kernel.org, "Paul E. McKenney", Alex Bligh
To: Alex Bligh, Eric Dumazet
Return-path:
Received: from mail.avalus.com ([89.16.176.221]:32790 "EHLO mail.avalus.com"
 rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
 id S1752559Ab1EHMTB (ORCPT ); Sun, 8 May 2011 08:19:01 -0400
In-Reply-To:
Content-Disposition: inline
Sender: netdev-owner@vger.kernel.org
List-ID:

--On 8 May 2011 10:35:02 +0100 Alex Bligh wrote:

> I suspect this may just mean an rcu reader holds the rcu_read_lock
> for a jiffies related time. Though I'm having difficulty seeing
> what that might be on a system where the net is in essence idle.

Having read the RCU docs, this can't be right, because blocking
is not legal when in the rcu_read_lock critical section.

The system concerned is an 8 cpu system but I get comparable
results on a 2 cpu system.

I am guessing that when the synchronize_sched() happens, all cores
but the cpu on which that is executing are idle (at least on
the vast majority of calls) as the machine itself is idle.
As I understand, RCU synchronization (in the absence of lots
of callbacks etc.) is meant to wait until it knows all RCU
read critical sections which are running on entry have
been left.
It exploits the fact that RCU read critical sections
cannot block, by waiting for a context switch on each cpu, OR
for that cpu to be in the idle state or running user code (also
incompatible with a read critical section).

The fact that increasing HZ masks the problem seems to imply that
synchronize_sched() is waiting when it shouldn't be, as it suggests
it's waiting for a context switch. But surely it shouldn't be
waiting for a context switch if all other cpu cores are idle?
It knows that it (the caller) doesn't hold an rcu_read_lock,
and presumably can see the other cpus are in the idle state,
in which case surely it should return immediately?

Distribution of latency in synchronize_sched() looks like this:

          20-49 us    110 instances (27.500%)
          50-99 us     45 instances (11.250%)
      5000-9999 us      5 instances (1.250%)
    10000-19999 us     33 instances (8.250%)
    20000-49999 us      4 instances (1.000%)
    50000-99999 us    191 instances (47.750%)
  100000-199999 us     12 instances (3.000%)

-- 
Alex Bligh
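
For anyone wanting to reproduce a distribution like the one above from their own raw latency samples, here is a minimal sketch in Python. It is not the tool used to produce the table: the 1-2-5 bucket boundaries are an assumption inferred from the ranges shown (20-49, 50-99, 5000-9999 and so on), and the names `BOUNDS`, `bucket`, `histogram` and `format_row` are hypothetical.

```python
import bisect
from collections import Counter

# Assumed 1-2-5 bucket lower bounds in microseconds (1, 2, 5, 10, 20, 50, ...),
# inferred from the ranges in the table; not taken from the original tool.
BOUNDS = [m * 10 ** e for e in range(7) for m in (1, 2, 5)]


def bucket(latency_us):
    """Return the (low, high) range holding latency_us; high is inclusive."""
    if latency_us < BOUNDS[0]:
        return (0, BOUNDS[0] - 1)
    i = bisect.bisect_right(BOUNDS, latency_us) - 1
    if i + 1 == len(BOUNDS):
        return (BOUNDS[i], None)  # open-ended top bucket
    return (BOUNDS[i], BOUNDS[i + 1] - 1)


def histogram(latencies_us):
    """Return (low, high, count, percent) rows sorted by bucket lower bound."""
    counts = Counter(bucket(x) for x in latencies_us)
    total = len(latencies_us)
    rows = sorted(counts.items(), key=lambda kv: kv[0][0])
    return [(lo, hi, n, 100.0 * n / total) for (lo, hi), n in rows]


def format_row(lo, hi, n, pct):
    """Format one row in the same style as the table."""
    span = f"{lo}+" if hi is None else f"{lo}-{hi}"
    return f"{span} us {n} instances ({pct:.3f}%)"


if __name__ == "__main__":
    # Made-up sample latencies in microseconds, for illustration only.
    for row in histogram([25, 30, 60, 75000]):
        print(format_row(*row))
```

Empty buckets are simply omitted, which matches the gap between 50-99 us and 5000-9999 us in the table. The counts in the table sum to 400 samples, and each percentage is its count divided by that total.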