Re: Scalability of interface creation and deletion

From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Alex Bligh <alex@alex.org.uk>
Cc: Eric Dumazet <eric.dumazet@gmail.com>, netdev@vger.kernel.org
Subject: Re: Scalability of interface creation and deletion
Date: Sun, 8 May 2011 05:50:28 -0700
Message-ID: <20110508125028.GK2641@linux.vnet.ibm.com>
In-Reply-To: <7B76F9D75FD26D716624004B@nimrod.local>

On Sun, May 08, 2011 at 01:18:55PM +0100, Alex Bligh wrote:
> 
> 
> --On 8 May 2011 10:35:02 +0100 Alex Bligh <alex@alex.org.uk> wrote:
> 
> >I suspect this may just mean an rcu reader holds the rcu_read_lock
> >for a jiffies related time. Though I'm having difficulty seeing
> >what that might be on a system where the net is in essence idle.
> 
> Having read the RCU docs, this can't be right, because blocking
> is not legal when in the rcu_read_lock critical section.
> 
> The system concerned is an 8 cpu system but I get comparable
> results on a 2 cpu system.
> 
> I am guessing that when the synchronize_sched() happens, all cores
> but the cpu on which that is executing are idle (at least on
> the vast majority of calls) as the machine itself is idle.
> As I understand, RCU synchronization (in the absence of lots
> of callbacks etc.) is meant to wait until it knows all RCU
> read critical sections which are running on entry have
> been left. It exploits the fact that RCU read critical sections
> cannot block by waiting for a context switch on each cpu, OR
> for that cpu to be in the idle state or running user code (also
> incompatible with a read critical section).
> 
> The fact that increasing HZ masks the problem seems to imply that
> synchronize_sched() is waiting when it shouldn't be, as it suggests
> it's waiting for a context switch. But surely it shouldn't be
> waiting for context switch if all other cpu cores are idle?
> It knows that it (the caller) doesn't hold an rcu_read_lock,
> and presumably can see the other cpus are in the idle state,
> in which case surely it should return immediately? Distribution
> of latency in synchronize_sched() looks like this:
> 
> 20-49 us 110 instances (27.500%)
> 50-99 us 45 instances (11.250%)

Really?  I am having a hard time believing the above two.  Are these really
2000-4999 us and 5000-9999 us?  That would be much more believable,
and expected on a busy system with lots of context switching.  Or on a
system with CONFIG_NO_HZ=n.

> 5000-9999 us 5 instances (1.250%)

This makes sense for a mostly-idle system with frequent short bursts
of work.

> 10000-19999 us 33 instances (8.250%)

This makes sense for a CONFIG_NO_HZ system that is idle, where there
is some amount of background work that is also using RCU grace periods.

> 20000-49999 us 4 instances (1.000%)
> 50000-99999 us 191 instances (47.750%)
> 100000-199999 us 12 instances (3.000%)

These last involve additional delays.  Possibilities include long-running
irq handlers, SMIs, or NMIs.

								Thanx, Paul
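
For reference, the bucket counts in the quoted distribution can be tallied
with a short Python sketch.  The bounds and instance counts are copied from
the quoted text; the names BUCKETS and summarize are made up for
illustration, and the 20000 us cut-off is only an assumption that stands in
for "the last three buckets", the ones attributed above to additional delays.

```python
# Buckets copied from the quoted synchronize_sched() latency distribution.
# Each entry is (low_us, high_us, instances).
BUCKETS = [
    (20, 49, 110),
    (50, 99, 45),
    (5000, 9999, 5),
    (10000, 19999, 33),
    (20000, 49999, 4),
    (50000, 99999, 191),
    (100000, 199999, 12),
]


def summarize(buckets, threshold_us):
    """Return (total, count in buckets starting at or above threshold_us,
    that count as a percentage of the total)."""
    total = sum(n for _, _, n in buckets)
    slow = sum(n for low, _, n in buckets if low >= threshold_us)
    return total, slow, 100.0 * slow / total


# Hypothetical cut-off: 20000 us selects the last three buckets.
total, slow, pct = summarize(BUCKETS, 20000)
print(f"{total} samples, {slow} at >= 20000 us ({pct:.2f}%)")
# prints "400 samples, 207 at >= 20000 us (51.75%)"
```

On these numbers, a little over half of the 400 measured calls fall in the
buckets that would need something beyond an ordinary grace period to explain.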
