From: Ingo Molnar <mingo@elte.hu>
To: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: linux-kernel@vger.kernel.org, netdev@vger.kernel.org,
	netfilter-devel@vger.kernel.org, akpm@linux-foundation.org,
	torvalds@linux-foundation.org, davem@davemloft.net,
	dada1@cosmosbay.com, zbr@ioremap.net, jeff.chua.linux@gmail.com,
	paulus@samba.org, laijs@cn.fujitsu.com, jengelh@medozas.de,
	r000n@r000n.net, benh@kernel.crashing.org,
	mathieu.desnoyers@polymtl.ca
Subject: Re: [PATCH RFC] v5 expedited "big hammer" RCU grace periods
Date: Tue, 19 May 2009 14:44:36 +0200
Message-ID: <20090519124436.GA6238@elte.hu>
In-Reply-To: <20090519123316.GA7159@linux.vnet.ibm.com>


* Paul E. McKenney <paulmck@linux.vnet.ibm.com> wrote:

> On Tue, May 19, 2009 at 10:58:25AM +0200, Ingo Molnar wrote:
> > 
> > * Paul E. McKenney <paulmck@linux.vnet.ibm.com> wrote:
> > 
> > > On Mon, May 18, 2009 at 05:42:41PM +0200, Ingo Molnar wrote:
> > > > 
> > > > * Paul E. McKenney <paulmck@linux.vnet.ibm.com> wrote:
> > > > 
> > > > > > i might be missing something fundamental here, but why not just 
> > > > > > have per CPU helper threads, all on the same waitqueue, and wake 
> > > > > > them up via a single wake_up() call? That would remove the SMP 
> > > > > > cross call (wakeups do immediate cross-calls already).
> > > > > 
> > > > > My concern with this is that the cache misses accessing all the 
> > > > > processes on this single waitqueue would be serialized, slowing 
> > > > > things down. In contrast, the bitmask that smp_call_function() 
> > > > > traverses delivers on the order of a thousand CPUs' worth of bits 
> > > > > per cache miss.  I will give it a try, though.
> > > > 
> > > > At least if you go via the migration threads, you can queue up 
> > > > requests to them locally. But there's going to be cachemisses 
> > > > _anyway_, since you have to access them all from a single CPU, 
> > > > and then they have to fetch details about what to do, and then 
> > > > have to notify the originator about completion.
> > > 
> > > Ah, so you are suggesting that I use smp_call_function() to run 
> > > code on each CPU that wakes up that CPU's migration thread?  I 
> > > will take a look at this.
> > 
> > My suggestion was to queue up a dummy 'struct migration_req' up with 
> > it (change migration_req::task == NULL to mean 'nothing') and simply 
> > wake it up using wake_up_process().
> 
> OK.  I was thinking of just using wake_up_process() without the
> migration_req structure, and unconditionally setting a per-CPU
> variable from within migration_thread() just before the list_empty()
> check.  In your approach we would need a NULL-pointer check just
> before the call to __migrate_task().
> 
> > That will force a quiescent state, without the need for any extra 
> > information, right?
> 
> Yep!
> 
> > This is what the scheduler code does, roughly:
> > 
> >                 wake_up_process(rq->migration_thread);
> >                 wait_for_completion(&req.done);
> > 
> > and this will always have to perform well. The 'req' could be put 
> > into PER_CPU, and a loop could be done like this:
> > 
> > 	for_each_online_cpu(cpu)
> >                 wake_up_process(cpu_rq(cpu)->migration_thread);
> > 
> > 	for_each_online_cpu(cpu)
> >                 wait_for_completion(&per_cpu(req, cpu).done);
> > 
> > hm?
> 
> My concern is the linear slowdown for large systems, but this 
> should be OK for modest systems (a few 10s of CPUs).  However, I 
> will try it out -- it does not need to be a long-term solution, 
> after all.

I think there is going to be a linear slowdown no matter what - 
because sending that many IPIs is going to be linear. (there are no 
'broadcast to all' IPIs anymore - on x86 we only have them if all 
physical APIC IDs are 7 or smaller.)

Also, no matter what scheme we use, the target CPU does have to be 
processed somehow and it does have to signal completion back somehow 
- which generates cachemisses.

I think what probably matters most is to go simple, and to use 
established kernel primitives - and the above is a really typical 
pattern for things like TLB flushes to a process having a presence 
on every physical CPU. Those aspects will be kept reasonably fast 
and balanced on all hardware that matters. (and if not, people will 
notice any TLB flush/shootdown linear slowdowns and will address it)

I could be wrong though ... maybe someone can get some numbers from 
a really large system?

	Ingo
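
For readers who want to experiment with the two-pass pattern discussed
above (wake every per-CPU helper first, then wait for each completion),
here is a minimal userspace sketch using POSIX threads. It is an analogy
under stated assumptions, not kernel code: the helper threads stand in
for the migration threads, `struct completion`, `complete()` and
`wait_for_completion()` are small userspace imitations of the kernel
primitives of the same name, and `struct helper`, `wake_up_helper()` and
`kick_all_and_wait()` are hypothetical names invented for illustration.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stdlib.h>

/* Userspace imitation of the kernel's struct completion (assumption). */
struct completion {
	pthread_mutex_t lock;
	pthread_cond_t cond;
	bool done;
};

/* Hypothetical per-"CPU" helper: stands in for a migration thread + req. */
struct helper {
	pthread_t thread;
	pthread_mutex_t lock;
	pthread_cond_t wake;
	bool kicked;
	struct completion req_done;
};

static void complete(struct completion *c)
{
	pthread_mutex_lock(&c->lock);
	c->done = true;
	pthread_cond_signal(&c->cond);
	pthread_mutex_unlock(&c->lock);
}

static void wait_for_completion(struct completion *c)
{
	pthread_mutex_lock(&c->lock);
	while (!c->done)
		pthread_cond_wait(&c->cond, &c->lock);
	pthread_mutex_unlock(&c->lock);
}

/* Helper body: sleep until kicked, then signal completion back. */
static void *helper_fn(void *arg)
{
	struct helper *h = arg;

	pthread_mutex_lock(&h->lock);
	while (!h->kicked)
		pthread_cond_wait(&h->wake, &h->lock);
	pthread_mutex_unlock(&h->lock);

	/* Having been scheduled at all plays the role of the quiescent state. */
	complete(&h->req_done);
	return NULL;
}

/* Stand-in for wake_up_process() on one helper. */
static void wake_up_helper(struct helper *h)
{
	pthread_mutex_lock(&h->lock);
	h->kicked = true;
	pthread_cond_signal(&h->wake);
	pthread_mutex_unlock(&h->lock);
}

/* Returns the number of helpers that completed, or -1 on setup failure. */
int kick_all_and_wait(int nhelpers)
{
	struct helper *h;
	int i, created, done = 0;

	if (nhelpers <= 0)
		return 0;
	h = calloc((size_t)nhelpers, sizeof(*h));
	if (!h)
		return -1;

	for (created = 0; created < nhelpers; created++) {
		struct helper *p = &h[created];

		pthread_mutex_init(&p->lock, NULL);
		pthread_cond_init(&p->wake, NULL);
		pthread_mutex_init(&p->req_done.lock, NULL);
		pthread_cond_init(&p->req_done.cond, NULL);
		if (pthread_create(&p->thread, NULL, helper_fn, p) != 0)
			break;
	}

	/* Pass 1: wake every helper without waiting, so they run in parallel. */
	for (i = 0; i < created; i++)
		wake_up_helper(&h[i]);

	/* Pass 2: collect the completions; this walk is linear in helpers. */
	for (i = 0; i < created; i++) {
		wait_for_completion(&h[i].req_done);
		pthread_join(h[i].thread, NULL);
		done++;
	}

	free(h);
	return created == nhelpers ? done : -1;
}
```

Calling `kick_all_and_wait(n)` from a test program exercises both loops.
Both passes are linear in the number of helpers, which is the slowdown
debated in the thread; the sketch says nothing about cache-miss or IPI
costs on real hardware.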


