linux-arch.vger.kernel.org archive mirror
From: Will Deacon <will.deacon@arm.com>
To: "Paul E. McKenney" <paulmck@linux.ibm.com>
Cc: linux-kernel@vger.kernel.org, linux-arch@vger.kernel.org,
	mingo@kernel.org, stern@rowland.harvard.edu,
	andrea.parri@amarulasolutions.com, peterz@infradead.org,
	boqun.feng@gmail.com, npiggin@gmail.com, dhowells@redhat.com,
	j.alglave@ucl.ac.uk, luc.maranget@inria.fr, akiyks@gmail.com,
	Benjamin Herrenschmidt <benh@kernel.crashing.org>,
	Michael Ellerman <mpe@ellerman.id.au>,
	Arnd Bergmann <arnd@arndb.de>, Palmer Dabbelt <palmer@sifive.com>,
	Daniel Lustig <dlustig@nvidia.com>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	"Maciej W. Rozycki" <macro@linux-mips.org>,
	Mikulas Patocka <mpatocka@redhat.com>
Subject: Re: [PATCH tip/core/rcu 04/21] docs/memory-barriers.txt: Rewrite "KERNEL I/O BARRIER EFFECTS" section
Date: Tue, 2 Apr 2019 14:03:46 +0100
Message-ID: <20190402130346.GA14559@fuggles.cambridge.arm.com>
In-Reply-To: <20190326234133.24962-4-paulmck@linux.ibm.com>

On Tue, Mar 26, 2019 at 04:41:16PM -0700, Paul E. McKenney wrote:
> From: Will Deacon <will.deacon@arm.com>
> 
> The "KERNEL I/O BARRIER EFFECTS" section of memory-barriers.txt is vague,
> x86-centric, out-of-date, incomplete and demonstrably incorrect in places.
> This is largely because I/O ordering is a horrible can of worms, but also
> because the document has stagnated as our understanding has evolved.
> 
> Attempt to address some of that, by rewriting the section based on
> recent(-ish) discussions with Arnd, BenH and others. Maybe one day we'll
> find a way to formalise this stuff, but for now let's at least try to
> make the English easier to understand.
> 
> Cc: "Paul E. McKenney" <paulmck@linux.ibm.com>
> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> Cc: Michael Ellerman <mpe@ellerman.id.au>
> Cc: Arnd Bergmann <arnd@arndb.de>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Andrea Parri <andrea.parri@amarulasolutions.com>
> Cc: Palmer Dabbelt <palmer@sifive.com>
> Cc: Daniel Lustig <dlustig@nvidia.com>
> Cc: David Howells <dhowells@redhat.com>
> Cc: Alan Stern <stern@rowland.harvard.edu>
> Cc: Linus Torvalds <torvalds@linux-foundation.org>
> Cc: "Maciej W. Rozycki" <macro@linux-mips.org>
> Cc: Mikulas Patocka <mpatocka@redhat.com>
> Signed-off-by: Will Deacon <will.deacon@arm.com>
> Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
> ---
>  Documentation/memory-barriers.txt | 115 ++++++++++++++++++------------
>  1 file changed, 70 insertions(+), 45 deletions(-)

If somebody could provide an Ack on this patch, I'd really appreciate it,
please. Whilst the portable ordering guarantees that I've documented are
fairly conservative, I do think that this change is a big improvement and
gives you what you need if you're writing a portable device driver for a new
piece of hardware. I'm tackling the removal of MMIOWB as a separate series.

I think Paul now requires an Ack before he'll send a patch to mainline,
hence the grovelling.

Cheers,

Will

> diff --git a/Documentation/memory-barriers.txt b/Documentation/memory-barriers.txt
> index 1c22b21ae922..158947ae78c2 100644
> --- a/Documentation/memory-barriers.txt
> +++ b/Documentation/memory-barriers.txt
> @@ -2599,72 +2599,97 @@ likely, then interrupt-disabling locks should be used to guarantee ordering.
>  KERNEL I/O BARRIER EFFECTS
>  ==========================
>  
> -When accessing I/O memory, drivers should use the appropriate accessor
> -functions:
> +Interfacing with peripherals via I/O accesses is deeply architecture and device
> +specific. Therefore, drivers which are inherently non-portable may rely on
> +specific behaviours of their target systems in order to achieve synchronization
> +in the most lightweight manner possible. For drivers intending to be portable
> +between multiple architectures and bus implementations, the kernel offers a
> +series of accessor functions that provide various degrees of ordering
> +guarantees:
>  
> - (*) inX(), outX():
> + (*) readX(), writeX():
>  
> -     These are intended to talk to I/O space rather than memory space, but
> -     that's primarily a CPU-specific concept.  The i386 and x86_64 processors
> -     do indeed have special I/O space access cycles and instructions, but many
> -     CPUs don't have such a concept.
> +     The readX() and writeX() MMIO accessors take a pointer to the peripheral
> +     being accessed as an __iomem * parameter. For pointers mapped with the
> +     default I/O attributes (e.g. those returned by ioremap()), the ordering
> +     guarantees are as follows:
>  
> -     The PCI bus, amongst others, defines an I/O space concept which - on such
> -     CPUs as i386 and x86_64 - readily maps to the CPU's concept of I/O
> -     space.  However, it may also be mapped as a virtual I/O space in the CPU's
> -     memory map, particularly on those CPUs that don't support alternate I/O
> -     spaces.
> +     1. All readX() and writeX() accesses to the same peripheral are ordered
> +        with respect to each other. For example, this ensures that MMIO register
> +        writes by the CPU to a particular device will arrive in program order.
>  
> -     Accesses to this space may be fully synchronous (as on i386), but
> -     intermediary bridges (such as the PCI host bridge) may not fully honour
> -     that.
> +     2. A writeX() by the CPU to the peripheral will first wait for the
> +        completion of all prior CPU writes to memory. For example, this ensures
> +        that writes by the CPU to an outbound DMA buffer allocated by
> +        dma_alloc_coherent() will be visible to a DMA engine when the CPU writes
> +        to its MMIO control register to trigger the transfer.
>  
> -     They are guaranteed to be fully ordered with respect to each other.
> +     3. A readX() by the CPU from the peripheral will complete before any
> +        subsequent CPU reads from memory can begin. For example, this ensures
> +        that reads by the CPU from an incoming DMA buffer allocated by
> +        dma_alloc_coherent() will not see stale data after reading from the DMA
> +        engine's MMIO status register to establish that the DMA transfer has
> +        completed.
>  
> -     They are not guaranteed to be fully ordered with respect to other types of
> -     memory and I/O operation.
> +     4. A readX() by the CPU from the peripheral will complete before any
> +        subsequent delay() loop can begin execution. For example, this ensures
> +        that two MMIO register writes by the CPU to a peripheral will arrive at
> +        least 1us apart if the first write is immediately read back with readX()
> +        and udelay(1) is called prior to the second writeX().
>  
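
To make bullets 2 and 3 a bit more concrete, here is a rough sketch of the
driver-side sequence they are meant to cover. It is a userspace mock-up, not
kernel code: mock_writel() and mock_readl() are hypothetical stand-ins for
writel() and readl(), and the register layout, the DMA_START/DMA_DONE bits
and the fake device are all made up for illustration. The mock enforces no
hardware ordering at all; it only shows the call sequence that a portable
driver would rely on without adding explicit barriers.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical register layout of an imaginary DMA engine. */
enum { REG_CTRL = 0, REG_STATUS = 1, NUM_REGS = 2 };
#define DMA_START 0x1u
#define DMA_DONE  0x1u

static uint32_t fake_regs[NUM_REGS]; /* stands in for the __iomem window */
static uint8_t dma_buf[16];          /* stands in for dma_alloc_coherent() memory */
static uint8_t dev_copy[16];         /* what the imaginary device fetched */

/*
 * Stand-in for writel(). In the kernel, prior CPU writes to memory are
 * complete before the MMIO write reaches the device (bullet 2).
 */
static void mock_writel(uint32_t val, unsigned int reg)
{
	assert(reg < (unsigned int)NUM_REGS);
	fake_regs[reg] = val;
	if (reg == REG_CTRL && (val & DMA_START)) {
		/* The fake device "DMAs" the buffer as soon as it is kicked. */
		memcpy(dev_copy, dma_buf, sizeof(dev_copy));
		fake_regs[REG_STATUS] = DMA_DONE;
	}
}

/*
 * Stand-in for readl(). In the kernel, the MMIO read completes before
 * subsequent CPU reads from memory can begin (bullet 3).
 */
static uint32_t mock_readl(unsigned int reg)
{
	assert(reg < (unsigned int)NUM_REGS);
	return fake_regs[reg];
}

/* Driver-side sequence: fill the buffer, kick the engine, poll for done. */
static int start_dma_and_wait(const uint8_t *src, size_t len)
{
	if (len > sizeof(dma_buf))
		return -1;
	memcpy(dma_buf, src, len);        /* plain writes to the DMA buffer */
	mock_writel(DMA_START, REG_CTRL); /* doorbell, no explicit barrier */
	while (!(mock_readl(REG_STATUS) & DMA_DONE))
		;                         /* poll the status register */
	return 0;                         /* buffer contents may be read now */
}
```

In real driver code the same shape would use writel()/readl() on a pointer
returned by ioremap(), with the buffer coming from dma_alloc_coherent().
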
> - (*) readX(), writeX():
> +     __iomem pointers obtained with non-default attributes (e.g. those returned
> +     by ioremap_wc()) are unlikely to provide many of these guarantees.
>  
> -     Whether these are guaranteed to be fully ordered and uncombined with
> -     respect to each other on the issuing CPU depends on the characteristics
> -     defined for the memory window through which they're accessing.  On later
> -     i386 architecture machines, for example, this is controlled by way of the
> -     MTRR registers.
> + (*) readX_relaxed(), writeX_relaxed():
>  
> -     Ordinarily, these will be guaranteed to be fully ordered and uncombined,
> -     provided they're not accessing a prefetchable device.
> +     These are similar to readX() and writeX(), but provide weaker memory
> +     ordering guarantees. Specifically, they do not guarantee ordering with
> +     respect to normal memory accesses or delay() loops (i.e. bullets 2-4 above)
> +     but they are still guaranteed to be ordered with respect to other accesses
> +     to the same peripheral when operating on __iomem pointers mapped with the
> +     default I/O attributes.
>  
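
One way to read the relaxed variants: a driver can program a batch of setup
registers with writeX_relaxed(), since those writes only need to stay ordered
against each other on the same peripheral, and then use a single writeX() for
the final "go" write that must also be ordered against normal memory. The
sketch below is again a userspace mock-up under stated assumptions:
mock_writel() and mock_writel_relaxed() are hypothetical stand-ins that only
record what was written and in which order, and the register names are
invented.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical registers of an imaginary DMA engine. */
enum { REG_ADDR_LO, REG_ADDR_HI, REG_LEN, REG_GO, NUM_REGS };
#define MAX_LOG 8

static uint32_t fake_regs[NUM_REGS];   /* stands in for the __iomem window */
static unsigned int write_log[MAX_LOG]; /* order in which registers were hit */
static unsigned int nr_writes;

static void record_write(uint32_t val, unsigned int reg)
{
	assert(reg < (unsigned int)NUM_REGS);
	assert(nr_writes < MAX_LOG);
	fake_regs[reg] = val;
	write_log[nr_writes++] = reg;
}

/* Stand-in for writel_relaxed(): ordered only against same-device accesses. */
static void mock_writel_relaxed(uint32_t val, unsigned int reg)
{
	record_write(val, reg);
}

/* Stand-in for writel(): additionally ordered after prior memory writes. */
static void mock_writel(uint32_t val, unsigned int reg)
{
	record_write(val, reg);
}

/* Program the transfer parameters cheaply, then issue one ordered kick. */
static void program_transfer(uint64_t addr, uint32_t len)
{
	mock_writel_relaxed((uint32_t)addr, REG_ADDR_LO);
	mock_writel_relaxed((uint32_t)(addr >> 32), REG_ADDR_HI);
	mock_writel_relaxed(len, REG_LEN);
	mock_writel(1, REG_GO); /* the only write that needs bullet 2 */
}
```

Whether this split is worth it depends on how expensive the ordered accessor
is on the architectures the driver targets.
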
> -     However, intermediary hardware (such as a PCI bridge) may indulge in
> -     deferral if it so wishes; to flush a store, a load from the same location
> -     is preferred[*], but a load from the same device or from configuration
> -     space should suffice for PCI.
> + (*) readsX(), writesX():
>  
> -     [*] NOTE! attempting to load from the same location as was written to may
> -	 cause a malfunction - consider the 16550 Rx/Tx serial registers for
> -	 example.
> +     The readsX() and writesX() MMIO accessors are designed for accessing
> +     register-based, memory-mapped FIFOs residing on peripherals that are not
> +     capable of performing DMA. Consequently, they provide only the ordering
> +     guarantees of readX_relaxed() and writeX_relaxed(), as documented above.
>  
> -     Used with prefetchable I/O memory, an mmiowb() barrier may be required to
> -     force stores to be ordered.
> + (*) inX(), outX():
>  
> -     Please refer to the PCI specification for more information on interactions
> -     between PCI transactions.
> +     The inX() and outX() accessors are intended to access legacy port-mapped
> +     I/O peripherals, which may require special instructions on some
> +     architectures (notably x86). The port number of the peripheral being
> +     accessed is passed as an argument.
>  
> - (*) readX_relaxed(), writeX_relaxed()
> +     Since many CPU architectures ultimately access these peripherals via an
> +     internal virtual memory mapping, the portable ordering guarantees provided
> +     by inX() and outX() are the same as those provided by readX() and writeX()
> +     respectively when accessing a mapping with the default I/O attributes.
>  
> -     These are similar to readX() and writeX(), but provide weaker memory
> -     ordering guarantees.  Specifically, they do not guarantee ordering with
> -     respect to normal memory accesses (e.g. DMA buffers) nor do they guarantee
> -     ordering with respect to LOCK or UNLOCK operations.  If the latter is
> -     required, an mmiowb() barrier can be used.  Note that relaxed accesses to
> -     the same peripheral are guaranteed to be ordered with respect to each
> -     other.
> +     Device drivers may expect outX() to emit a non-posted write transaction
> +     that waits for a completion response from the I/O peripheral before
> +     returning. This is not guaranteed by all architectures and is therefore
> +     not part of the portable ordering semantics.
> +
> + (*) insX(), outsX():
> +
> +     As above, the insX() and outsX() accessors provide the same ordering
> +     guarantees as readsX() and writesX() respectively when accessing a mapping
> +     with the default I/O attributes.
>  
>   (*) ioreadX(), iowriteX()
>  
>       These will perform appropriately for the type of access they're actually
>       doing, be it inX()/outX() or readX()/writeX().
>  
> +All of these accessors assume that the underlying peripheral is little-endian,
> +and will therefore perform byte-swapping operations on big-endian architectures.
> +
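
As a small illustration of the little-endian assumption, the sketch below
assembles a 32-bit value from the byte layout a little-endian peripheral
would present. le32_load() is a hypothetical helper written for this example,
not a kernel function; because it builds the value byte by byte, it yields
the same result on little-endian and big-endian hosts, which is the effect
the accessors achieve by byte-swapping on big-endian architectures.

```c
#include <stdint.h>

/*
 * Interpret four bytes as a little-endian 32-bit value, independent of
 * the endianness of the CPU running this code.
 */
static uint32_t le32_load(const uint8_t *p)
{
	return (uint32_t)p[0] |
	       ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) |
	       ((uint32_t)p[3] << 24);
}
```
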
> +Composing I/O ordering barriers with SMP ordering barriers and LOCK/UNLOCK
> +operations is a dangerous sport which may require the use of mmiowb(). See the
> +subsection "Acquires vs I/O accesses" for more information.
>  
>  ========================================
>  ASSUMED MINIMUM EXECUTION ORDERING MODEL
> -- 
> 2.17.1
> 
