Re: [RFC PATCH] introduce sys_membarrier(): process-wide memory barrier (v5)

public inbox for linux-kernel@vger.kernel.org

From: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
To: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: linux-kernel@vger.kernel.org,
	"Paul E. McKenney" <paulmck@linux.vnet.ibm.com>,
	Steven Rostedt <rostedt@goodmis.org>,
	Oleg Nesterov <oleg@redhat.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Ingo Molnar <mingo@elte.hu>,
	akpm@linux-foundation.org, josh@joshtriplett.org,
	tglx@linutronix.de, Valdis.Kletnieks@vt.edu, dhowells@redhat.com,
	laijs@cn.fujitsu.com, dipankar@in.ibm.com
Subject: Re: [RFC PATCH] introduce sys_membarrier(): process-wide memory barrier (v5)
Date: Wed, 13 Jan 2010 10:03:24 -0500
Message-ID: <20100113150324.GE30875@Krystal>
In-Reply-To: <20100113130716.B3DC.A69D9226@jp.fujitsu.com>

* KOSAKI Motohiro (kosaki.motohiro@jp.fujitsu.com) wrote:
> > * KOSAKI Motohiro (kosaki.motohiro@jp.fujitsu.com) wrote:
[...]
> > > Why do we need both expedited and non-expedited modes? At least, this
> > > documentation is bad. It suggests "you always have to use non-expedited
> > > mode!".
> > 
> > Right. Maybe I should rather write:
> > 
> >  + * @expedited: (0) Low overhead, but slow execution (few milliseconds)
> >  + *             (1) Slightly higher overhead, fast execution (few microseconds)
> > 
> > And I could probably go as far as adding a few paragraphs:
> > 
> > Using the non-expedited mode is recommended for applications which can
> > afford leaving the caller thread waiting for a few milliseconds. A good
> > example would be a thread dedicated to executing RCU callbacks, which
> > waits for callbacks to be enqueued most of the time anyway.
> > 
> > The expedited mode is recommended whenever the application needs to have
> > control returning to the caller thread as quickly as possible. An
> > example of such an application would be one which uses the same thread
> > to perform data structure updates and issue the RCU synchronization.
> > 
> > It is perfectly safe to call both expedited and non-expedited
> > sys_membarriers in a process.
> > 
> > 
> > Does that help ?
> 
> Does liburcu need both? I bet the average programmer won't understand this
> explanation. Please recall that syscall interfaces are used by non-kernel
> developers too. If liburcu only uses either (0) or (1), I hope you can
> remove the other one.
> 
> But if liburcu really needs both, the above explanation is good enough,
> I think.

As Paul said, we need both in liburcu. These usage scenarios are
explained in the system call documentation.

> 
> 
> > > > +	 * Memory barrier on the caller thread _before_ sending first
> > > > +	 * IPI. Matches memory barriers around mm_cpumask modification in
> > > > +	 * switch_mm().
> > > > +	 */
> > > > +	smp_mb();
> > > > +	if (!alloc_cpumask_var(&tmpmask, GFP_KERNEL)) {
> > > > +		membarrier_retry();
> > > > +		goto unlock;
> > > > +	}
> > > 
> > > If CONFIG_CPUMASK_OFFSTACK=1, alloc_cpumask_var() calls kmalloc(). FWIW,
> > > calling kmalloc() seems to destroy the worth of this patch.
> > 
> > Why? I'm not sure I understand your point. Even if we call kmalloc() to
> > allocate the cpumask, this is a constant overhead. The benefit of
> > smp_call_function_many() over smp_call_function_single() is that it
> > scales better by allowing IPIs to be broadcast when the architecture
> > supports it. Or maybe I'm missing something?
> 
> It depends on what "constant overhead" means. kmalloc() might cause
> page reclaim and nondeterministic delay. I'm not sure (1) how much
> slower membarrier_retry() is than smp_call_function_many() and (2) which
> you think is more important, average or worst-case performance. I only
> note that I don't think GFP_KERNEL is constant overhead.

10,000,000 sys_membarrier calls (varying the number of threads to which
we send IPIs), IPI-to-many, 8-core system:

T=1: 0m20.173s
T=2: 0m20.506s
T=3: 0m22.632s
T=4: 0m24.759s
T=5: 0m26.633s
T=6: 0m29.654s
T=7: 0m30.669s

Just doing local mb()+single IPI to T other threads:

T=1: 0m18.801s
T=2: 0m29.086s
T=3: 0m46.841s
T=4: 0m53.758s
T=5: 1m10.856s
T=6: 1m21.142s
T=7: 1m38.362s

So sending single IPIs adds about 1.5 microseconds per extra core. With
the IPI-to-many scheme, we add about 0.2 microseconds per extra core. So
we gain nearly an order of magnitude in scalability. The initial cost of
the cpumask allocation (which seems to be allocated on the stack in my
config) is just about 0.14 microseconds per call (1.372 s over
10,000,000 calls at T=1). So here, we only have a small gain for the
1 IPI case, which does not justify the added complexity of dealing with
it differently.
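
The per-core figures can be rechecked from the two tables above. Below
is a minimal sketch in C, assuming that the end-point slope (T=1 to T=7)
is a fair estimate of the cost per extra thread. The array and function
names are made up for illustration and are not part of the patch. With
the timings above, the slopes come out at roughly 1.3 microseconds per
extra thread for single IPIs and roughly 0.17 for IPI-to-many.

```c
#include <assert.h>
#include <stdio.h>

#define NR_CALLS 10000000.0	/* sys_membarrier calls per run */
#define NR_RUNS  7		/* one run per T = 1..7 */

/* Wall-clock seconds per run, copied from the tables (index 0 is T=1). */
static const double ipi_many_s[NR_RUNS] = {
	20.173, 20.506, 22.632, 24.759, 26.633, 29.654, 30.669
};
static const double ipi_single_s[NR_RUNS] = {
	18.801, 29.086, 46.841, 53.758, 70.856, 81.142, 98.362
};

/*
 * Average cost added to each call by every thread beyond the first,
 * in microseconds (illustrative helper: end-point slope only).
 */
static double extra_us_per_thread(const double *secs, int n)
{
	assert(n >= 2);
	return (secs[n - 1] - secs[0]) / (n - 1) / NR_CALLS * 1e6;
}

/* Per-call gap between the two schemes at T=1, in microseconds. */
static double fixed_gap_us(void)
{
	return (ipi_many_s[0] - ipi_single_s[0]) / NR_CALLS * 1e6;
}

/* Print the three derived figures and the ratio of the two slopes. */
static void print_summary(void)
{
	double single = extra_us_per_thread(ipi_single_s, NR_RUNS);
	double many = extra_us_per_thread(ipi_many_s, NR_RUNS);

	printf("single IPI:  %.3f us per extra thread\n", single);
	printf("IPI-to-many: %.3f us per extra thread\n", many);
	printf("slope ratio: %.1f\n", single / many);
	printf("T=1 gap:     %.3f us per call\n", fixed_gap_us());
}
```

Calling print_summary() from a small main() prints the two slopes, their
ratio, and the per-call gap between the two schemes at T=1.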

Also... it's pretty much a slow path anyway compared to the RCU
read-side. I just don't want this slow path to scale badly.

> 
> hmm...
> Do you intend to use GFP_ATOMIC?

Would it help to lower the allocation overhead ?

Thanks,

Mathieu


-- 
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68
