From: Will Deacon <will.deacon@arm.com>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: "alexander.duyck@gmail.com" <alexander.duyck@gmail.com>,
"linux-arch@vger.kernel.org" <linux-arch@vger.kernel.org>,
Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
Michael Neuling <mikey@neuling.org>,
Tony Luck <tony.luck@intel.com>,
Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>,
Alexander Duyck <alexander.h.duyck@redhat.com>,
Peter Zijlstra <peterz@infradead.org>,
Benjamin Herrenschmidt <benh@kernel.crashing.org>,
Heiko Carstens <heiko.carstens@de.ibm.com>,
Oleg Nesterov <oleg@redhat.com>,
Michael Ellerman <michael@ellerman.id.au>,
Geert Uytterhoeven <geert@linux-m68k.org>,
Frederic Weisbecker <fweisbec@gmail.com>,
Martin Schwidefsky <schwidefsky@de.ibm.com>,
Russell King <linux@arm.linux.org.uk>,
"Paul E. McKenney" <paulmck@linux.vnet.ibm.com>,
Ingo Molnar <mingo@kernel.org>
Subject: Re: [PATCH] arch: Introduce read_acquire()
Date: Wed, 12 Nov 2014 10:10:51 +0000
Message-ID: <20141112101051.GA26437@arm.com>
In-Reply-To: <CA+55aFwo9f3tWaRqN1Xam9UkWv1B5F4YnRP1Qx3T78E4o=8YJQ@mail.gmail.com>
On Tue, Nov 11, 2014 at 07:40:22PM +0000, Linus Torvalds wrote:
> On Tue, Nov 11, 2014 at 10:57 AM, <alexander.duyck@gmail.com> wrote:
> > On reviewing the documentation and code for smp_load_acquire() it occurred
> > to me that implementing something similar for CPU <-> device interaction
> > would be worthwhile. This commit provides just the load/read side of this
> > in the form of read_acquire().
>
> So I don't hate the concept, but there's a couple of reasons to think
> this is broken.
>
> One is just the name. Why do we have "smp_load_acquire()", but then
> call the non-smp version "read_acquire()"? That makes very little
> sense to me. Why did "load" become "read"?
[...]
> But we do have a very real difference between "smp_rmb()" (inter-cpu
> cache coherency read barrier) and "rmb()" (full memory barrier that
> synchronizes with IO).
>
> And your patch is very confused about this. In *some* places you use
> "rmb()", and in other places you just use "smp_load_acquire()". Have
> you done extensive verification to check that this is actually ok?
> Because the performance difference you quote very much seems to be
> about your x86 testing now skipping the IO-synchronizing "rmb()", and
> depending on DMA being ordered even without it.
>
> And I'm pretty sure that's actually fine on x86. The real
> IO-synchronizing rmb() (which translates into a lfence) is only needed
> for when you have uncached accesses (ie mmio) on x86. So I don't think
> your code is wrong, I just want to verify that everybody understands
> the issues. I'm not even sure DMA can ever really have weaker memory
> ordering (I really don't see how you'd be able to do a read barrier
> without DMA stores being ordered natively), so maybe I worry too much,
> but the ppc people in particular should look at this, because the ppc
> memory ordering rules and serialization are some completely odd ad-hoc
> black magic....
Right, so now I see what's going on here. This isn't actually anything
to do with acquire/release (I don't know of any architectures that have
a read-barrier-acquire instruction), it's all about DMA to main memory.
If a device is DMA'ing data *and* control information (e.g. 'descriptor
valid') to memory, then it must be maintaining order between those writes
with respect to memory. In that case, using the usual MMIO barriers can
be overkill because we really just want to enforce read-ordering on the CPU
side. In fact, I think you could even do this with a fake address dependency
on ARM (although I'm not actually suggesting we do that).
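
A rough user-space sketch of the pattern described above may help: the device writes the payload first and the 'descriptor valid' flag last, so the CPU only needs to order its own two reads. Everything here is an assumption made for illustration: `struct rx_desc`, `DESC_VALID`, `rx_desc_poll()` and `fake_device_complete()` are hypothetical names, not code from the patch, and a C11 acquire fence stands in for whatever read barrier the kernel would use.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical descriptor layout; not taken from any real driver. */
struct rx_desc {
	uint32_t len;            /* payload: written by the device first */
	_Atomic uint32_t status; /* control: written by the device last  */
};

#define DESC_VALID 0x1u

/*
 * Stand-in for a CPU-side read barrier on DMA memory. An acquire fence
 * is used here only so that the sketch builds outside the kernel.
 */
static inline void dma_read_barrier_sketch(void)
{
	atomic_thread_fence(memory_order_acquire);
}

/* Returns true and stores the length once the descriptor is valid. */
static bool rx_desc_poll(struct rx_desc *d, uint32_t *len_out)
{
	uint32_t status = atomic_load_explicit(&d->status,
					       memory_order_relaxed);

	if (!(status & DESC_VALID))
		return false;

	/* Order the status read before the payload read. */
	dma_read_barrier_sketch();
	*len_out = d->len;
	return true;
}

/* Simulates the device side: payload first, then the valid flag. */
static void fake_device_complete(struct rx_desc *d, uint32_t len)
{
	d->len = len;
	atomic_store_explicit(&d->status, DESC_VALID, memory_order_release);
}
```

Note that no MMIO access appears anywhere in the consumer path, which is why a barrier that also synchronizes with IO would be stronger than this pattern requires.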
In light of that, it actually sounds like we want a new set of barrier
macros that apply only to DMA buffer accesses by the CPU -- they wouldn't
enforce ordering against things like MMIO registers. I wonder whether any
architectures would implement them differently to the smp_* flavours?
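
To make the idea of a DMA-buffer-only barrier concrete, here is one possible shape for the read side, written as a user-space sketch. The macro name `dma_buf_rmb_sketch()` and the helper `read_if_ready()` are hypothetical, and the per-architecture choices are assumptions for illustration rather than a proposal for any particular tree.

```c
#include <stdatomic.h>
#include <stdint.h>

/*
 * Hypothetical read barrier meant only for CPU accesses to DMA buffers
 * in normal memory. It makes no ordering promise about MMIO registers.
 */
#if defined(__x86_64__) || defined(__i386__)
/*
 * x86 does not reorder loads from write-back memory with other loads,
 * so only the compiler has to be stopped from reordering.
 */
#define dma_buf_rmb_sketch() __asm__ __volatile__("" ::: "memory")
#elif defined(__aarch64__)
/*
 * Load barrier over the outer shareable domain, so that observers
 * outside the CPU cluster (such as a DMA master) are covered.
 */
#define dma_buf_rmb_sketch() __asm__ __volatile__("dmb oshld" ::: "memory")
#else
/* Conservative portable fallback. */
#define dma_buf_rmb_sketch() atomic_thread_fence(memory_order_seq_cst)
#endif

/* Reads a flag, then a payload word, with the barrier in between. */
static inline int read_if_ready(const volatile uint32_t *flag,
				const volatile uint32_t *payload,
				uint32_t *out)
{
	if (!*flag)
		return 0;

	dma_buf_rmb_sketch();
	*out = *payload;
	return 1;
}
```

On an architecture where the SMP barrier only covers other CPUs, the DMA flavour would differ from it in the shareability domain it names; elsewhere the two could well be identical.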
> But anything with non-cache-coherent DMA is obviously very suspect too.
I think non-cache-coherent DMA should work too (at least, on ARM), but
only for buffers mapped via dma_alloc_coherent (i.e. a non-cacheable
mapping).
Will