From: Nick Piggin <nickpiggin@yahoo.com.au>
To: Linus Torvalds <torvalds@linux-foundation.org>,
Matthew Wilcox <matthew@wil.cx>
Cc: Trent Piepho <tpiepho@freescale.com>,
Russell King <rmk+lkml@arm.linux.org.uk>,
Benjamin Herrenschmidt <benh@kernel.crashing.org>,
David Miller <davem@davemloft.net>,
linux-arch@vger.kernel.org, scottwood@freescale.com,
linuxppc-dev@ozlabs.org, alan@lxorguk.ukuu.org.uk,
linux-kernel@vger.kernel.org
Subject: Re: MMIO and gcc re-ordering issue
Date: Tue, 10 Jun 2008 16:56:50 +1000
Message-ID: <200806101656.51211.nickpiggin@yahoo.com.au>
In-Reply-To: <alpine.LFD.1.10.0806031153360.3473@woody.linux-foundation.org>
On Wednesday 04 June 2008 05:07, Linus Torvalds wrote:
> On Tue, 3 Jun 2008, Trent Piepho wrote:
> > On Tue, 3 Jun 2008, Linus Torvalds wrote:
> > > On Tue, 3 Jun 2008, Nick Piggin wrote:
> > > > Linus: on x86, memory operations to wc and wc+ memory are not ordered
> > > > with one another, or operations to other memory types (ie. load/load
> > > > and store/store reordering is allowed). Also, as you know, store/load
> > > > reordering is explicitly allowed as well, which covers all memory
> > > > types. So perhaps it is not quite true to say readl/writel is
> > > > strongly ordered by default even on x86. You would have to put in
> > > > some mfence instructions in them to make it so.
> >
> > So on x86, these could be re-ordered?
> >
> > writel(START_OPERATION, CONTROL_REGISTER);
> > status = readl(STATUS_REGISTER);
>
> With both registers in a WC+ area, yes. The write may be in the WC buffers
> until the WC buffers are flushed (short list: a fence, a serializing
> instruction, a read-write to uncached memory, or an interrupt. There are
> others, but those are the main ones).
>
> But if the status register is in uncached memory (which is the only *sane*
> thing to do), then it doesn't matter if the control register is in WC
> memory. Because the status register read is itself serializing with the WC
> buffer, it's actually fine.
>
> So this is used for putting things like ring queues in WC memory, and fill
> them up with writes, and get nice bursty write traffic with the CPU
> automatically buffering it up (think "stdio.h on a really low level"). And
> if you then have the command registers in UC memory or using IO port
> accesses, reading and writing to them will automatically serialize.
OK, I'm still not quite sure where this has ended up. I guess you are happy
with x86 semantics as they are now. That is, all IO accesses are strongly
ordered WRT one another and WRT cacheable memory (which includes keeping
them within spinlocks), *unless* one asks for WC memory, in which case that
memory is quite weakly ordered (and is not even ordered by a regular IO
readl, at least according to AMD spec). So for WC memory, one still needs
to use mb/rmb/wmb.
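To make the WC case concrete, here is a small userspace toy model, not kernel code. It assumes the pessimistic behaviour described above, where stores to WC memory sit in a CPU buffer until an explicit barrier drains them and a later register access does not flush them. All names (struct toy_dev, wc_write, toy_wmb, toy_doorbell, submit) are made up for illustration; toy_wmb merely stands in for wmb(). Real hardware may or may not reorder in any given run; the model simply makes the worst case deterministic.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define RING_SLOTS 4

/* Toy device: separates what the "device" can see from what is still
 * buffered on the CPU side. Purely illustrative. */
struct toy_dev {
    uint32_t ring_visible[RING_SLOTS];  /* stores that reached the device */
    uint32_t ring_pending[RING_SLOTS];  /* stores still in the WC buffer */
    int      pending_valid[RING_SLOTS]; /* which pending slots hold data */
    uint32_t consumed[RING_SLOTS];      /* device's snapshot at doorbell time */
};

/* Store to WC memory: buffered, not yet visible to the device. */
static void wc_write(struct toy_dev *d, int slot, uint32_t val)
{
    assert(slot >= 0 && slot < RING_SLOTS);
    d->ring_pending[slot] = val;
    d->pending_valid[slot] = 1;
}

/* Stand-in for wmb(): drain every buffered WC store to the device. */
static void toy_wmb(struct toy_dev *d)
{
    for (int i = 0; i < RING_SLOTS; i++) {
        if (d->pending_valid[i]) {
            d->ring_visible[i] = d->ring_pending[i];
            d->pending_valid[i] = 0;
        }
    }
}

/* Doorbell write: the device reads the ring exactly as it sees it now. */
static void toy_doorbell(struct toy_dev *d)
{
    memcpy(d->consumed, d->ring_visible, sizeof(d->consumed));
}

/* Fill the ring, optionally issue the barrier, then ring the doorbell.
 * Returns the first descriptor as the device observed it. */
static uint32_t submit(struct toy_dev *d, int use_barrier)
{
    memset(d, 0, sizeof(*d));
    for (int i = 0; i < RING_SLOTS; i++)
        wc_write(d, i, 0x100u + (uint32_t)i);
    if (use_barrier)
        toy_wmb(d);
    toy_doorbell(d);
    return d->consumed[0];
}
```

In this model, submit(&d, 1) lets the device see the descriptors that were written, while submit(&d, 0) leaves it looking at a stale (zeroed) ring. That stale read is the kind of breakage the explicit barrier is there to rule out.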
So that still doesn't tell us what *minimum* level of ordering we should
provide in the cross platform readl/writel API. Some relatively sane
suggestions would be:
- as strong as x86. Guaranteed not to break drivers that work on x86,
  but slower on some archs. To me, this is most pleasing. It is much,
  much easier to notice something is going a little slower and to work
  out how to use weaker ordering there, than it is to debug some
  once-in-a-blue-moon breakage caused by just the right architecture,
  driver, etc. It totally frees up the driver writer from thinking
  about barriers, provided they get the locking right.

- ordered WRT other IO accessors, constrained within spinlocks, but not
  ordered WRT cacheable memory. This is what powerpc does now. It's a
  little faster for them, and probably covers the vast majority of
  drivers, but there are real possibilities to get it wrong (trivial
  example: using bit locks or mutexes or any kind of open coded locking
  or lockless synchronisation can break).

- (less sane) same as above, but not ordered WRT spinlocks. This is what
  ia64 (sn2) does. From a purist POV, it is a little less arbitrary than
  powerpc, but in practice, it will break a lot more drivers than powerpc.
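The bit-lock hazard in the second option can be sketched with a single-threaded toy model. It assumes one possible way an architecture might keep MMIO "constrained within spinlocks": the write accessor sets a flag, and only the spinlock release path checks that flag and pays for the barrier. This is an illustration of the idea, not a description of any particular architecture's implementation, and every name below (io_pending, toy_writel, toy_spin_unlock, toy_bit_unlock) is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

static int      io_pending;      /* hypothetical per-CPU "MMIO happened" flag */
static unsigned barriers_issued; /* counts IO-vs-unlock barriers actually paid */

/* MMIO write: do the store and remember that IO is outstanding. */
static void toy_writel(uint32_t val, volatile uint32_t *addr)
{
    *addr = val;
    io_pending = 1;
}

/* Spinlock release: orders any outstanding MMIO before dropping the lock. */
static void toy_spin_unlock(int *lock)
{
    if (io_pending) {
        barriers_issued++;   /* stand-in for the expensive barrier */
        io_pending = 0;
    }
    *lock = 0;
}

/* Open-coded bit lock release: knows nothing about the IO flag. */
static void toy_bit_unlock(unsigned long *word, int bit)
{
    assert(bit >= 0 && bit < (int)(8 * sizeof(*word)));
    *word &= ~(1UL << bit);
}

/* Critical section protected by the spinlock (pretend it is held). */
static unsigned run_with_spinlock(volatile uint32_t *reg)
{
    int lock = 1;
    barriers_issued = 0;
    io_pending = 0;
    toy_writel(1, reg);
    toy_spin_unlock(&lock);
    return barriers_issued;
}

/* Same critical section protected by a bit lock instead. */
static unsigned run_with_bitlock(volatile uint32_t *reg)
{
    unsigned long word = 1UL;
    barriers_issued = 0;
    io_pending = 0;
    toy_writel(1, reg);
    toy_bit_unlock(&word, 0);
    return barriers_issued;
}
```

In the model, the spinlock path issues one barrier before the lock is dropped and the bit-lock path issues none. So under this scheme the MMIO write could leak out of the bit-locked critical section unless the driver adds a barrier by hand.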
I was kind of joking about taking control of this issue :) But seriously,
it needs a decision to be made. I vote for #1. My rationale: I'm still
finding relatively major (well, found maybe 4 or 5 in the last couple of
years) bugs in the mm subsystem due to memory ordering problems. This is
apparently one of the most well reviewed and tested bits of code in the
kernel by people who know all about memory ordering. Not to mention that
mm/ does not have to worry about IO ordering at all. Then apparently
drivers are the least reviewed and tested. Connect dots.
Now that doesn't leave weaker ordering architectures lumped with "slow old
x86 semantics". Think of it as giving them the benefit of sharing x86
development and testing :) We can then formalise the relaxed __ accessors
to be more complete (ie. +/- byteswapping). I'd also propose to add
io_rmb/io_wmb/io_mb that order io/io access, to help architectures like
sn2 where the io/cacheable barrier is pretty expensive.
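A rough sketch of how the proposed layering could look, again as a userspace toy rather than real accessors. It assumes the default writel pays a full io/cacheable barrier on every call, and that a driver on a hot path could switch to a relaxed accessor plus the cheaper io-only barrier. toy_mb, toy_io_wmb, toy_raw_writel and toy_writel are stand-ins invented for this example; where an architecture actually places its barriers is up to that architecture.

```c
#include <assert.h>
#include <stdint.h>

/* Counters standing in for the cost of each barrier flavour. */
static unsigned full_barriers;  /* io <-> cacheable memory ordering */
static unsigned io_barriers;    /* io <-> io ordering only */

static void toy_mb(void)     { full_barriers++; }
static void toy_io_wmb(void) { io_barriers++; }

/* Relaxed accessor: just the store, no ordering implied. */
static void toy_raw_writel(uint32_t val, volatile uint32_t *addr)
{
    *addr = val;
}

/* Default accessor under option #1: fully ordered on every call. */
static void toy_writel(uint32_t val, volatile uint32_t *addr)
{
    toy_mb();
    toy_raw_writel(val, addr);
}

/* Program n registers using only the default accessor. */
static void program_strong(volatile uint32_t *regs, int n)
{
    assert(n >= 0);
    for (int i = 0; i < n; i++)
        toy_writel((uint32_t)i, &regs[i]);
}

/* Same sequence, hand-tuned: one full barrier to order against earlier
 * cacheable stores, then io-only barriers between the register writes. */
static void program_relaxed(volatile uint32_t *regs, int n)
{
    assert(n >= 0);
    toy_mb();
    for (int i = 0; i < n; i++) {
        if (i > 0)
            toy_io_wmb();
        toy_raw_writel((uint32_t)i, &regs[i]);
    }
}
```

For four registers, the default path counts four full barriers, while the tuned path counts one full barrier and three io-only ones. That is the trade being proposed: safe by default, with an opt-in cheaper path where a profile shows it matters.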
Any comments?