All of lore.kernel.org
 help / color / mirror / Atom feed
From: Nicholas Piggin <npiggin@gmail.com>
To: Andy Lutomirski <luto@amacapital.net>
Cc: Anton Blanchard <anton@ozlabs.org>, Arnd Bergmann <arnd@arndb.de>,
	linux-arch <linux-arch@vger.kernel.org>,
	LKML <linux-kernel@vger.kernel.org>,
	Linux-MM <linux-mm@kvack.org>,
	linuxppc-dev <linuxppc-dev@lists.ozlabs.org>,
	Andy Lutomirski <luto@kernel.org>,
	Mathieu Desnoyers <mathieu.desnoyers@efficios.com>,
	Peter Zijlstra <peterz@infradead.org>, X86 ML <x86@kernel.org>
Subject: Re: [PATCH 2/8] x86: use exit_lazy_tlb rather than membarrier_mm_sync_core_before_usermode
Date: Sun, 06 Dec 2020 09:14:57 +1000	[thread overview]
Message-ID: <1607209402.fogfsh8ov4.astroid@bobo.none> (raw)
In-Reply-To: <116A6B40-C77B-4B6A-897B-18342CD62CEC@amacapital.net>

Excerpts from Andy Lutomirski's message of December 6, 2020 2:11 am:
> 
>> On Dec 5, 2020, at 12:00 AM, Nicholas Piggin <npiggin@gmail.com> wrote:
>> 
>> 
>> I disagree. Until now nobody following it noticed that the mm gets
>> un-lazied in other cases, because that was not too clear from the
>> code (only indirectly using non-standard terminology in the arch
>> support document).
> 
>> In other words, membarrier needs a special sync to deal with the case 
>> when a kthread takes the mm.
> 
> I don’t think this is actually true. Somehow the x86 oddities about 
> CR3 writes leaked too much into the membarrier core code and comments. 
> (I doubt this is x86 specific.  The actual x86 specific part seems to 
> be that we can return to user mode without syncing the instruction 
> stream.)
> 
> As far as I can tell, membarrier doesn’t care at all about laziness. 
> Membarrier cares about rq->curr->mm.  The fact that a cpu can switch 
> its actual loaded mm without scheduling at all (on x86 at least) is 
> entirely beside the point except insofar as it has an effect on 
> whether a subsequent switch_mm() call serializes.

Core membarrier itself doesn't care about laziness, which is why the
membarrier flush should go in exit_lazy_tlb() or other x86 specific
code (at least until more architectures did the same thing and we moved
it into generic code). I just meant this non-serialising return as 
documented in the membarrier arch enablement doc specifies the lazy tlb
requirement.

If an mm was lazy tlb for a kernel thread and then it becomes unlazy,
and if switch_mm is serialising but return to user is not, then you
need a serialising instruction somewhere before return to user. unlazy
is the logical place to add that, because the lazy tlb mm (i.e., 
switching to a kernel thread and back without switching mm) is what 
opens the hole.

> If we notify 
> membarrier about x86’s asynchronous CR3 writes, then membarrier needs 
> to understand what to do with them, which results in an unmaintainable 
> mess in membarrier *and* in the x86 code.

How do you mean? exit_lazy_tlb is the opposite, core scheduler notifying
arch code about when an mm becomes not-lazy, and nothing to do with
membarrier at all even. It's a convenient hook to do your un-lazying.
I guess you can do it also checking things in switch_mm and keeping state
in arch code, I don't think that's necessarily the best place to put it.

So membarrier code is unchanged (it cares that the serialise is done at
un-lazy time), core code is simpler (no knowledge of this membarrier 
quirk and it already knows about lazy-tlb so the calls actually improve 
the documentation), and x86 code I would argue becomes nicer (or no real
difference at worst) because you can move some exit lazy tlb handling to
that specific call rather than decipher it from switch_mm.

> 
> I’m currently trying to document how membarrier actually works, and 
> hopefully this will result in untangling membarrier from mmdrop() and 
> such.

That would be nice.

> 
> A silly part of this is that x86 already has a high quality 
> implementation of most of membarrier(): flush_tlb_mm().  If you flush 
> an mm’s TLB, we carefully propagate the flush to all threads, with 
> attention to memory ordering.  We can’t use this directly as an 
> arch-specific implementation of membarrier because it has the annoying 
> side affect of flushing the TLB and because upcoming hardware might be 
> able to flush without guaranteeing a core sync.  (Upcoming means Zen 
> 3, but the Zen 3 implementation is sadly not usable by Linux.)
> 

A hardware broadcast TLB flush, you mean? What makes it unusable by 
Linux out of curiosity?

WARNING: multiple messages have this Message-ID (diff)
From: Nicholas Piggin <npiggin@gmail.com>
To: Andy Lutomirski <luto@amacapital.net>
Cc: linux-arch <linux-arch@vger.kernel.org>,
	Arnd Bergmann <arnd@arndb.de>,
	Peter Zijlstra <peterz@infradead.org>, X86 ML <x86@kernel.org>,
	LKML <linux-kernel@vger.kernel.org>,
	Linux-MM <linux-mm@kvack.org>,
	Mathieu Desnoyers <mathieu.desnoyers@efficios.com>,
	Andy Lutomirski <luto@kernel.org>,
	linuxppc-dev <linuxppc-dev@lists.ozlabs.org>
Subject: Re: [PATCH 2/8] x86: use exit_lazy_tlb rather than membarrier_mm_sync_core_before_usermode
Date: Sun, 06 Dec 2020 09:14:57 +1000	[thread overview]
Message-ID: <1607209402.fogfsh8ov4.astroid@bobo.none> (raw)
In-Reply-To: <116A6B40-C77B-4B6A-897B-18342CD62CEC@amacapital.net>

Excerpts from Andy Lutomirski's message of December 6, 2020 2:11 am:
> 
>> On Dec 5, 2020, at 12:00 AM, Nicholas Piggin <npiggin@gmail.com> wrote:
>> 
>> 
>> I disagree. Until now nobody following it noticed that the mm gets
>> un-lazied in other cases, because that was not too clear from the
>> code (only indirectly using non-standard terminology in the arch
>> support document).
> 
>> In other words, membarrier needs a special sync to deal with the case 
>> when a kthread takes the mm.
> 
> I don’t think this is actually true. Somehow the x86 oddities about 
> CR3 writes leaked too much into the membarrier core code and comments. 
> (I doubt this is x86 specific.  The actual x86 specific part seems to 
> be that we can return to user mode without syncing the instruction 
> stream.)
> 
> As far as I can tell, membarrier doesn’t care at all about laziness. 
> Membarrier cares about rq->curr->mm.  The fact that a cpu can switch 
> its actual loaded mm without scheduling at all (on x86 at least) is 
> entirely beside the point except insofar as it has an effect on 
> whether a subsequent switch_mm() call serializes.

Core membarrier itself doesn't care about laziness, which is why the
membarrier flush should go in exit_lazy_tlb() or other x86 specific
code (at least until more architectures did the same thing and we moved
it into generic code). I just meant this non-serialising return as 
documented in the membarrier arch enablement doc specifies the lazy tlb
requirement.

If an mm was lazy tlb for a kernel thread and then it becomes unlazy,
and if switch_mm is serialising but return to user is not, then you
need a serialising instruction somewhere before return to user. unlazy
is the logical place to add that, because the lazy tlb mm (i.e., 
switching to a kernel thread and back without switching mm) is what 
opens the hole.

> If we notify 
> membarrier about x86’s asynchronous CR3 writes, then membarrier needs 
> to understand what to do with them, which results in an unmaintainable 
> mess in membarrier *and* in the x86 code.

How do you mean? exit_lazy_tlb is the opposite, core scheduler notifying
arch code about when an mm becomes not-lazy, and nothing to do with
membarrier at all even. It's a convenient hook to do your un-lazying.
I guess you can do it also checking things in switch_mm and keeping state
in arch code, I don't think that's necessarily the best place to put it.

So membarrier code is unchanged (it cares that the serialise is done at
un-lazy time), core code is simpler (no knowledge of this membarrier 
quirk and it already knows about lazy-tlb so the calls actually improve 
the documentation), and x86 code I would argue becomes nicer (or no real
difference at worst) because you can move some exit lazy tlb handling to
that specific call rather than decipher it from switch_mm.

> 
> I’m currently trying to document how membarrier actually works, and 
> hopefully this will result in untangling membarrier from mmdrop() and 
> such.

That would be nice.

> 
> A silly part of this is that x86 already has a high quality 
> implementation of most of membarrier(): flush_tlb_mm().  If you flush 
> an mm’s TLB, we carefully propagate the flush to all threads, with 
> attention to memory ordering.  We can’t use this directly as an 
> arch-specific implementation of membarrier because it has the annoying 
> side affect of flushing the TLB and because upcoming hardware might be 
> able to flush without guaranteeing a core sync.  (Upcoming means Zen 
> 3, but the Zen 3 implementation is sadly not usable by Linux.)
> 

A hardware broadcast TLB flush, you mean? What makes it unusable by 
Linux out of curiosity?

  reply	other threads:[~2020-12-05 23:15 UTC|newest]

Thread overview: 92+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-11-28 16:01 [PATCH 0/8] shoot lazy tlbs Nicholas Piggin
2020-11-28 16:01 ` Nicholas Piggin
2020-11-28 16:01 ` [PATCH 1/8] lazy tlb: introduce exit_lazy_tlb Nicholas Piggin
2020-11-28 16:01   ` Nicholas Piggin
2020-11-29  0:38   ` Andy Lutomirski
2020-11-29  0:38     ` Andy Lutomirski
2020-12-02  2:49     ` Nicholas Piggin
2020-12-02  2:49       ` Nicholas Piggin
2020-11-28 16:01 ` [PATCH 2/8] x86: use exit_lazy_tlb rather than membarrier_mm_sync_core_before_usermode Nicholas Piggin
2020-11-28 16:01   ` Nicholas Piggin
2020-11-28 17:55   ` Andy Lutomirski
2020-11-28 17:55     ` Andy Lutomirski
2020-12-02  2:49     ` Nicholas Piggin
2020-12-02  2:49       ` Nicholas Piggin
2020-12-03  5:09       ` Andy Lutomirski
2020-12-03  5:09         ` Andy Lutomirski
2020-12-05  8:00         ` Nicholas Piggin
2020-12-05  8:00           ` Nicholas Piggin
2020-12-05 16:11           ` Andy Lutomirski
2020-12-05 16:11             ` Andy Lutomirski
2020-12-05 23:14             ` Nicholas Piggin [this message]
2020-12-05 23:14               ` Nicholas Piggin
2020-12-06  0:36               ` Andy Lutomirski
2020-12-06  0:36                 ` Andy Lutomirski
2020-12-06  3:59                 ` Nicholas Piggin
2020-12-06  3:59                   ` Nicholas Piggin
2020-12-11  0:11                   ` Andy Lutomirski
2020-12-11  0:11                     ` Andy Lutomirski
2020-12-14  4:07                     ` Nicholas Piggin
2020-12-14  4:07                       ` Nicholas Piggin
2020-12-14  5:53                       ` Nicholas Piggin
2020-12-14  5:53                         ` Nicholas Piggin
2020-11-30 14:57   ` Mathieu Desnoyers
2020-11-30 14:57     ` Mathieu Desnoyers
2020-11-28 16:01 ` [PATCH 3/8] x86: remove ARCH_HAS_SYNC_CORE_BEFORE_USERMODE Nicholas Piggin
2020-11-28 16:01   ` Nicholas Piggin
2020-11-28 16:01 ` [PATCH 4/8] lazy tlb: introduce lazy mm refcount helper functions Nicholas Piggin
2020-11-28 16:01   ` Nicholas Piggin
2020-11-28 16:01 ` [PATCH 5/8] lazy tlb: allow lazy tlb mm switching to be configurable Nicholas Piggin
2020-11-28 16:01   ` Nicholas Piggin
2020-11-29  0:36   ` Andy Lutomirski
2020-11-29  0:36     ` Andy Lutomirski
2020-12-02  2:49     ` Nicholas Piggin
2020-12-02  2:49       ` Nicholas Piggin
2020-11-28 16:01 ` [PATCH 6/8] lazy tlb: shoot lazies, a non-refcounting lazy tlb option Nicholas Piggin
2020-11-28 16:01   ` Nicholas Piggin
2020-11-29  3:54   ` Andy Lutomirski
2020-11-29  3:54     ` Andy Lutomirski
2020-11-29 20:16     ` Andy Lutomirski
2020-11-29 20:16       ` Andy Lutomirski
2020-11-30  9:25       ` Peter Zijlstra
2020-11-30  9:25         ` Peter Zijlstra
2020-11-30 18:31       ` Andy Lutomirski
2020-11-30 18:31         ` Andy Lutomirski
2020-12-01 21:27         ` Will Deacon
2020-12-01 21:27           ` Will Deacon
2020-12-01 21:50           ` Andy Lutomirski
2020-12-01 21:50             ` Andy Lutomirski
2020-12-01 23:04             ` Will Deacon
2020-12-01 23:04               ` Will Deacon
2020-12-02  3:47         ` Nicholas Piggin
2020-12-02  3:47           ` Nicholas Piggin
2020-12-03  5:05           ` Andy Lutomirski
2020-12-03  5:05             ` Andy Lutomirski
2020-12-03 17:03         ` Alexander Gordeev
2020-12-03 17:03           ` Alexander Gordeev
2020-12-03 17:14           ` Andy Lutomirski
2020-12-03 17:14             ` Andy Lutomirski
2020-12-03 18:33             ` Alexander Gordeev
2020-12-03 18:33               ` Alexander Gordeev
2020-11-30  9:26     ` Peter Zijlstra
2020-11-30  9:26       ` Peter Zijlstra
2020-11-30  9:30     ` Peter Zijlstra
2020-11-30  9:30       ` Peter Zijlstra
2020-11-30  9:34       ` Peter Zijlstra
2020-11-30  9:34         ` Peter Zijlstra
2020-12-02  3:09     ` Nicholas Piggin
2020-12-02  3:09       ` Nicholas Piggin
2020-12-02 11:17   ` Peter Zijlstra
2020-12-02 11:17     ` Peter Zijlstra
2020-12-02 12:45     ` Peter Zijlstra
2020-12-02 12:45       ` Peter Zijlstra
2020-12-02 14:19   ` Peter Zijlstra
2020-12-02 14:19     ` Peter Zijlstra
2020-12-02 14:38     ` Andy Lutomirski
2020-12-02 14:38       ` Andy Lutomirski
2020-12-02 16:29       ` Peter Zijlstra
2020-12-02 16:29         ` Peter Zijlstra
2020-11-28 16:01 ` [PATCH 7/8] powerpc: use lazy mm refcount helper functions Nicholas Piggin
2020-11-28 16:01   ` Nicholas Piggin
2020-11-28 16:01 ` [PATCH 8/8] powerpc/64s: enable MMU_LAZY_TLB_SHOOTDOWN Nicholas Piggin
2020-11-28 16:01   ` Nicholas Piggin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1607209402.fogfsh8ov4.astroid@bobo.none \
    --to=npiggin@gmail.com \
    --cc=anton@ozlabs.org \
    --cc=arnd@arndb.de \
    --cc=linux-arch@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linuxppc-dev@lists.ozlabs.org \
    --cc=luto@amacapital.net \
    --cc=luto@kernel.org \
    --cc=mathieu.desnoyers@efficios.com \
    --cc=peterz@infradead.org \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.