From: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
To: Nicholas Piggin <npiggin@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>,
	"Paul E. McKenney" <paulmck@linux.vnet.ibm.com>,
	linux-kernel <linux-kernel@vger.kernel.org>,
	Boqun Feng <boqun.feng@gmail.com>, Andrew Hunter <ahh@google.com>,
	maged michael <maged.michael@gmail.com>,
	gromer <gromer@google.com>, Avi Kivity <avi@scylladb.com>,
	Michael Ellerman <mpe@ellerman.id.au>,
	Benjamin Herrenschmidt <benh@kernel.crashing.org>,
	Palmer Dabbelt <palmer@dabbelt.com>,
	Dave Watson <davejwatson@fb.com>
Subject: Re: [RFC PATCH v2] membarrier: expedited private command
Date: Mon, 31 Jul 2017 19:31:19 +0000 (UTC)
Message-ID: <1551913097.355.1501529479848.JavaMail.zimbra@efficios.com>
In-Reply-To: <20170729115840.7dff4ea5@roar.ozlabs.ibm.com>

----- On Jul 28, 2017, at 9:58 PM, Nicholas Piggin npiggin@gmail.com wrote:

> On Fri, 28 Jul 2017 17:06:53 +0000 (UTC)
> Mathieu Desnoyers <mathieu.desnoyers@efficios.com> wrote:
> 
>> ----- On Jul 28, 2017, at 12:46 PM, Peter Zijlstra peterz@infradead.org wrote:
>> 
>> > On Fri, Jul 28, 2017 at 03:38:15PM +0000, Mathieu Desnoyers wrote:
>> >> > Which only leaves PPC stranded.. but the 'good' news is that mpe says
>> >> > they'll probably need a barrier in switch_mm() in any case.
>> >> 
>> >> As I pointed out in my other email, I plan to do this:
>> >> 
>> >> --- a/kernel/sched/core.c
>> >> +++ b/kernel/sched/core.c
>> >> @@ -2636,6 +2636,11 @@ static struct rq *finish_task_switch(struct task_struct *prev)
>> >>         vtime_task_switch(prev);
>> >>         perf_event_task_sched_in(prev, current);
>> > 
>> > Here would place it _inside_ the rq->lock, which seems to make more
>> > sense given the purpose of the barrier, but either way works given its
>> > definition.
>> 
>> Given its naming "...after_unlock_lock", I thought it would be clearer to put
>> it after the unlock. Anyway, this barrier does not seem to be used to ensure
>> the release barrier per se (unlock already has release semantics), but rather
>> ensures a full memory barrier wrt memory accesses that are synchronized by
>> means other than this lock.
>> 
>> >   
>> >>         finish_lock_switch(rq, prev);
>> > 
>> > You could put the whole thing inside IS_ENABLED(CONFIG_SYSMEMBARRIER) or
>> > something.
>> 
>> I'm tempted to wait until we hear from powerpc maintainers, so we learn
>> whether they deeply care about this extra barrier in finish_task_switch()
>> before making it conditional on CONFIG_MEMBARRIER.
>> 
>> Having a guaranteed barrier after context switch on all architectures may
>> have other uses.
> 
> I haven't had time to read the thread and understand exactly why you need
> this extra barrier, I'll do it next week. Thanks for cc'ing us on it.
> 
> A smp_mb is pretty expensive on powerpc CPUs. Removing the sync from
> switch_to increased thread switch performance by 2-3%. Putting it in
> switch_mm may be a little less painful, but still we have to weigh it
> against the benefit of this new functionality. Would that be a net win
> for the average end-user? Seems unlikely.
> 
> But we also don't want to lose sys_membarrier completely. Would it be too
> painful to make MEMBARRIER_CMD_PRIVATE_EXPEDITED return error, or make it
> fall back to a slower case if we decide not to implement it?

An expedited membarrier is needed to implement synchronization schemes such
as hazard pointers, RCU, and garbage collectors in user-space. Take hazard
pointers as an example: if memory is freed by the same thread that performs
the retire, the slowdown introduced by a non-expedited membarrier is not
acceptable at all. In that case, only an expedited membarrier keeps the
slowdown acceptable. The user's alternative today is to rely on undocumented
side-effects of mprotect() to achieve the same result. This happens to work
on some architectures, and may break in the future.

If users do not have an expedited membarrier on a given architecture, and
are told that mprotect() does not provide the barrier guarantees they are
looking for, then they would have to add heavy-weight memory barriers on
many user-space fast-paths on those specific architectures, assuming they
are willing to go through that trouble.
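
To make that trade-off concrete, here is a minimal user-space sketch of
the pattern described above. It is illustrative only. The function names
(membarrier wrapper, heavy_barrier_init, heavy_barrier, light_barrier) and
the expedited_available flag are hypothetical. The
MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED step is an assumption taken from
the interface as it was later merged upstream, not from this RFC. When the
expedited command is missing, the sketch falls back to full fences on the
fast path, which is exactly the cost discussed above.

```c
#define _GNU_SOURCE
#include <linux/membarrier.h>
#include <stdatomic.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Thin wrapper around the raw system call (hypothetical helper name). */
static int membarrier(int cmd, unsigned int flags)
{
	return (int)syscall(__NR_membarrier, cmd, flags, 0);
}

/* Set once at startup, before any thread uses the barriers below. */
static int expedited_available;

/*
 * Probe for the expedited private command and register the process.
 * Returns 1 if the expedited command can be used, 0 otherwise.
 * Assumption: registration is required, as in the merged interface.
 */
static int heavy_barrier_init(void)
{
	int mask = membarrier(MEMBARRIER_CMD_QUERY, 0);

	if (mask < 0 || !(mask & MEMBARRIER_CMD_PRIVATE_EXPEDITED))
		return 0;
	if (membarrier(MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED, 0) != 0)
		return 0;
	expedited_available = 1;
	return 1;
}

/* Slow side, e.g. the thread that scans hazard pointers before freeing. */
static int heavy_barrier(void)
{
	if (expedited_available)
		return membarrier(MEMBARRIER_CMD_PRIVATE_EXPEDITED, 0);
	/* Fallback: only fences the calling thread. */
	atomic_thread_fence(memory_order_seq_cst);
	return 0;
}

/* Fast side, e.g. right after publishing a hazard pointer. */
static inline void light_barrier(void)
{
	if (expedited_available)
		/* Compiler barrier only; heavy_barrier() supplies the rest. */
		atomic_signal_fence(memory_order_seq_cst);
	else
		/* No expedited membarrier: pay for a full fence here. */
		atomic_thread_fence(memory_order_seq_cst);
}
```

A reader thread would call light_barrier() between publishing its hazard
pointer and re-checking the protected pointer, and the reclaiming thread
would call heavy_barrier() between unlinking a node and scanning the
hazard pointers.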

I understand that the 2-3% overhead when switching between threads is a big
deal. Do you have numbers on the overhead added by a memory barrier in
switch_mm? I suspect that switching between processes (including the cost
of the cache-line and TLB misses that follow) is considerably heavier in
the first place.

Thanks,

Mathieu

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com
