From: Will Deacon <will.deacon@arm.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>,
torvalds@linux-foundation.org, oleg@redhat.com,
paulmck@linux.vnet.ibm.com, mpe@ellerman.id.au,
npiggin@gmail.com, linux-kernel@vger.kernel.org,
mingo@kernel.org, stern@rowland.harvard.edu,
Mel Gorman <mgorman@suse.de>, Rik van Riel <riel@redhat.com>
Subject: Re: [RFC][PATCH 1/5] mm: Rework {set,clear,mm}_tlb_flush_pending()
Date: Wed, 2 Aug 2017 10:02:21 +0100
Message-ID: <20170802090220.GE15219@arm.com>
In-Reply-To: <20170802085111.iupsx6s3hw42a52b@hirez.programming.kicks-ass.net>

On Wed, Aug 02, 2017 at 10:51:11AM +0200, Peter Zijlstra wrote:
> On Wed, Aug 02, 2017 at 09:43:50AM +0100, Will Deacon wrote:
> > On Wed, Aug 02, 2017 at 09:15:23AM +0100, Will Deacon wrote:
>
> > > I really think we should avoid defining TLB invalidation in terms of
> > > smp_mb() because it's a lot more subtle than that.
> >
> > Another worry I have here is with architectures that can optimise the
> > "only need to flush the local TLB" case. For example, this version of 'R':
> >
> >
> > P0:
> > WRITE_ONCE(x, 1);
> > smp_mb();
> > WRITE_ONCE(y, 1);
> >
> > P1:
> > WRITE_ONCE(y, 2);
> > flush_tlb_range(...); // Only needs to flush the local TLB
> > r0 = READ_ONCE(x);
> >
> >
> > It doesn't seem unreasonable to me for y==2 && r0==0 if the
> > flush_tlb_range(...) ends up only doing local invalidation. As a concrete
> > example, imagine a CPU with a page table walker that can snoop the local
> > store-buffer. Then, the local flush_tlb_range in P1 only needs to progress
> > the write to y as far as the store-buffer before it can invalidate the local
> > TLB. Once the TLB is invalidated, it can read x knowing that the translation
> > is up-to-date wrt the page table, but that read doesn't need to wait for
> > the write to y to become visible to other CPUs.
> >
> > So flush_tlb_range is actually weaker than smp_mb in some respects, yet the
> > flush_tlb_pending stuff will still work correctly.
>
> So while I think you're right, and we could live with this, after all,
> if we know the mm is CPU local, there shouldn't be any SMP concerns wrt
> its page tables. Do you really want to make this more complicated?

It gives us a nice performance lift on arm64 and I have a patch...[1]

Will

[1] https://git.kernel.org/pub/scm/linux/kernel/git/will/linux.git/commit/?h=aarch64/devel&id=1c7cf53658f0fa16338d1f8406285ae28fd5f616
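
For readers who want to experiment with the 'R' litmus test quoted above, here is a rough user-space sketch in C11. It is not kernel code and it cannot model a TLB invalidation. As an assumption, WRITE_ONCE()/READ_ONCE() are mapped to relaxed atomics and smp_mb() to a seq_cst fence, and flush_tlb_range() is replaced by a stand-in that is either a full fence or nothing at all. The names run_r_litmus() and p1_full_fence are hypothetical. With the full fence in P1, the outcome y==2 && r0==0 is forbidden. Without it, the outcome is permitted, which is roughly the weaker behaviour described for a local-only flush.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Shared locations from the litmus test. */
static atomic_int x, y;
static int r0;
/* Hypothetical knob: does P1's stand-in for flush_tlb_range() act as smp_mb()? */
static bool p1_full_fence;

static void *p0(void *arg)
{
	(void)arg;
	atomic_store_explicit(&x, 1, memory_order_relaxed);	/* WRITE_ONCE(x, 1) */
	atomic_thread_fence(memory_order_seq_cst);		/* smp_mb() */
	atomic_store_explicit(&y, 1, memory_order_relaxed);	/* WRITE_ONCE(y, 1) */
	return NULL;
}

static void *p1(void *arg)
{
	(void)arg;
	atomic_store_explicit(&y, 2, memory_order_relaxed);	/* WRITE_ONCE(y, 2) */
	/* Assumption: a full fence models a flush that implies smp_mb();
	 * no fence models the weaker, local-only case. */
	if (p1_full_fence)
		atomic_thread_fence(memory_order_seq_cst);
	r0 = atomic_load_explicit(&x, memory_order_relaxed);	/* READ_ONCE(x) */
	return NULL;
}

/* Run the test 'iters' times; return how often y==2 && r0==0 was seen. */
int run_r_litmus(int iters, bool full_fence)
{
	int hits = 0;

	p1_full_fence = full_fence;
	for (int i = 0; i < iters; i++) {
		pthread_t t0, t1;
		int rc;

		atomic_store(&x, 0);
		atomic_store(&y, 0);
		r0 = -1;

		rc = pthread_create(&t0, NULL, p0, NULL);
		assert(rc == 0);
		rc = pthread_create(&t1, NULL, p1, NULL);
		assert(rc == 0);
		pthread_join(t0, NULL);
		pthread_join(t1, NULL);

		/* Both threads have been joined, so these are the final values. */
		if (atomic_load(&y) == 2 && r0 == 0)
			hits++;
	}
	return hits;
}
```

Calling run_r_litmus(n, true) should always return 0. With false, a non-zero count is allowed but not guaranteed, since whether the reordering shows up depends on the hardware and on timing.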