From: Jamie Lokier <jamie@shareable.org>
To: Russell King - ARM Linux <linux@arm.linux.org.uk>
Cc: "Jamie Iles" <jamie@jamieiles.com>,
	gerg@snapgear.com, B32542@freescale.com, netdev@vger.kernel.org,
	s.hauer@pengutronix.de, bryan.wu@canonical.com,
	baruch@tkos.co.il, w.sang@pengutronix.de, r64343@freescale.com,
	"Shawn Guo" <shawn.guo@freescale.com>,
	eric@eukrea.com,
	"Uwe Kleine-König" <u.kleine-koenig@pengutronix.de>,
	davem@davemloft.net, linux-arm-kernel@lists.infradead.org,
	lw@karo-electronics.de
Subject: Re: [PATCH v3 08/10] ARM: mxs: add ocotp read function
Date: Thu, 6 Jan 2011 00:50:52 +0000
Message-ID: <20110106005052.GA4476@shareable.org>
In-Reply-To: <20110105201502.GK8638@n2100.arm.linux.org.uk>

Russell King - ARM Linux wrote:
> On Wed, Jan 05, 2011 at 07:44:18PM +0000, Jamie Lokier wrote:
> > 'git show 534be1d5' explains how it works: cpu_relax() flushes buffered
> > writes from _this_ CPU, so that other CPUs which are polling can make
> > progress, which avoids this CPU getting stuck if there is an indirect
> > dependency (no matter how convoluted) between what it's polling and what
> > it wrote just before.
> > 
> > So cpu_relax() is *essential* in some polling loops, not a hint.
> > 
> > In principle that could happen for I/O polling, if (a) buffered memory
> > writes are delayed by I/O read transactions, and (b) the device state we're
> > waiting on depends on I/O yet to be done on another CPU, which could be
> > polling memory first (e.g. a spinlock).
> > 
> > I doubt (a) in practice - but what about buses that block during I/O read?
> > (I have a chip like that here, but it's ARMv4T.)
> 
> Let's be clear - ARMv5 and below generally are well ordered architectures
> within the limits of caching.  There are cases where the write buffer
> allows two writes to pass each other.  However, for IO we generally map
> these - especially for ARMv4 and below - as 'uncacheable unbufferable'.
> So on these, if the program says "read this location" the pipeline will
> stall until the read has been issued - and if you use the result in the
> next instruction, it will stall until the data is available.  So really,
> it's not a problem here.
> 
> ARMv6 and above have a weakly ordered memory model with speculative
> prefetching, so memory reads/writes can be completely unordered.  Device
> accesses can pass memory accesses, but device accesses are always visible
> in program order with respect to each other.
> 
> So, if you're spinning in a loop reading an IO device, all previous IO
> accesses will be completed (in all ARM architectures) before the result
> of your read is evaluated.

No, that wasn't the scenario - it was:

You're spinning reading an IO device, whose state depends indirectly
on a *CPU memory* write that is forever buffered.

(Go and re-read 'git show 534be1d5' if you haven't already.)

The indirect dependence is that another CPU needs to see that write
before it can tell the device to change state in whatever way the
first CPU is polling for.

It's probably clearer in code:

CPU #1

    spin_lock(&mydev->lock);
    /* Look at state. */
    spin_unlock(&mydev->lock);       <-- THIS MEMORY WRITE BUFFERED FOREVER

    /* We expect this to be quick enough that polling is cool. */
    while (readl(mydev->reg_status) & MYDEV_STATUS_BUSY) {
        /* If only we had cpu_relax() */
    }

CPU #2

    spin_lock(&mydev->lock);         <-- STUCK HERE
    /* Look at state. */
    spin_unlock(&mydev->lock);

    writel(MYDEV_TRIGGER, mydev->reg_go);   /* Device is BUSY until this. */

The deadlock in this code might happen when CPU #2 is waiting for
the spinlock, and CPU #1's memory write remains in its write buffer
for the whole of CPU #1's polling loop.

If that can happen, it's fixed by adding cpu_relax() - to generic
driver code with polling loops.

It can only happen if the CPUs that buffer writes because they
prioritise continuous memory reads (i.e. ARMv6) also do so for
continuous IO reads.  This might even apply to non-ARM archs with
non-trivial cpu_relax() definitions; I don't know, as they don't always
explain why.

The above driver style isn't particularly obvious, but there are a lot
of drivers with almost every conceivable access pattern.  If you use
your imagination, especially if the second piece of code is an
interrupt handler, it's plausible.  This example would be better off
sleeping and waiting normally, but there's nothing inherently forbidden
about the above pattern (except that cpu_relax() is needed).
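
For illustration, here is a minimal userspace sketch of CPU #1's loop
with the cpu_relax() call added.  The cpu_relax() and readl() below are
stand-ins, not the kernel's definitions: the stub is only a compiler
barrier, so it doesn't drain any write buffer.  The mydev_wait_idle()
name and the max_polls bound are assumptions of mine that aren't in the
example above:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MYDEV_STATUS_BUSY 0x1u

/* Stand-in for the kernel's cpu_relax(): only a compiler barrier here.
 * The real one is arch-specific and may also be a memory barrier. */
static inline void cpu_relax(void)
{
    __asm__ __volatile__("" ::: "memory");
}

/* Stand-in for the kernel's readl(): a plain volatile 32-bit read. */
static inline uint32_t readl(const volatile uint32_t *addr)
{
    return *addr;
}

/* Hypothetical helper: poll the status register until BUSY clears.
 * Returns 0 once the device is idle, or -1 if it is still busy after
 * max_polls retries (the bound is an assumption, to avoid spinning
 * forever). */
static int mydev_wait_idle(const volatile uint32_t *reg_status,
                           unsigned long max_polls)
{
    assert(reg_status != NULL);

    while (readl(reg_status) & MYDEV_STATUS_BUSY) {
        if (max_polls-- == 0)
            return -1;
        /* Give this CPU's buffered writes a chance to become visible. */
        cpu_relax();
    }
    return 0;
}
```

In a driver, the while loop in CPU #1's code would become something
like mydev_wait_idle(mydev->reg_status, limit), called after
spin_unlock().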

> (But, let's make you squirm some more - mb() on ARMv6 and above may
> equate to a CPU memory barrier _plus_ a few IO accesses to the external
> L2 cache controller - which will be ordered wrt other IO accesses of
> course.)

I squirm at all modern ARM architectures.  Omit the slightest highly
version-specific thing, or run a kernel built with slightly wrong
config options, and it's fine except for random, very rare memory or
I/O corruption.  The workarounds and special bits seem to get more and
more convoluted with each version.

-- Jamie
