From: linux@arm.linux.org.uk (Russell King - ARM Linux)
To: linux-arm-kernel@lists.infradead.org
Subject: [PATCH] ARM: The mandatory barrier rmb() must be a dsb() in for device accesses
Date: Thu, 28 Apr 2011 19:45:38 +0100 [thread overview]
Message-ID: <20110428184538.GL17290@n2100.arm.linux.org.uk> (raw)
In-Reply-To: <BANLkTimh_TD-6gvbYC4UBCmqvkH8DpbKpQ@mail.gmail.com>
So, what am I supposed to do with 6870/1 (rmb) and the wmb patch?
One or other is wrong - they certainly are not both correct as rmb()
can not be stronger than wmb().
On Mon, Apr 11, 2011 at 01:43:35PM +0800, Ming Lei wrote:
> Hi Catalin,
>
> 2011/4/10 Catalin Marinas <catalin.marinas@arm.com>:
> > On 9 April 2011 20:03, Ming Lei <tom.leiming@gmail.com> wrote:
> >> 2011/4/9 Catalin Marinas <catalin.marinas@arm.com>:
> >>> Probably the above document isn't comprehensive enough. It mainly
> >>> targets memory ordering between processors. I think another example
> >>> that mentions DSB is the mailbox scenario.
> >>
> >> I also saw this example of 8.1, but I don't think it is related with
> >> linux read memory barrier, see below:
> >>
> >> ? ? ? ?- 8.1 requires that the 'STR R5, [R1]' in P1 is completed before
> >> ? ? ? ?'LDR R5, [R1]' in P2;
> >
> > Its not related to the read memory barrier but it refers to ordering
> > between accesses to Normal memory and Device memory. The mailbox here
> > at [R4] can be a device. I'm copying the relevant code here for others
> > following this thread:
> >
> > P1:
> > ? ?STR R5, [R1] ; message stored to shared memory location
> > ? ?DSB [ST]
> > ? ?STR R1, [R4] ; R4 contains the address of a mailbox
> >
> > P2:
> > ? ?; interrupt service routine
> > ? ?LDR R5, [R1]
> >
> >> ? ? ? ?- Documentation/memory-barriers.txt said:
> >> ? ? ? ?1), Memory barriers are such interventions. ?They impose a
> >> ? ? ? ?perceived partial ordering over the memory operations on either
> >> ? ? ? ?side of the barrier.
> >>
> >> ? ? ? ?2),There is no guarantee that any of the memory accesses specified
> >> ? ? ? ?before a memory barrier will be _complete_ by the completion of a
> >> ? ? ? ?memory barrier instruction; the barrier can be considered to draw a
> >> ? ? ? ?line in that CPU's access queue that accesses of the appropriate type
> >> ? ? ? ?may not cross.
> >
> > While the above is correct, the Linux document is quite vague in
> > relation to the mandatory barriers and it's not clear whether the
> > above only refers to the SMP barriers.
>
> Yes, I agree it is vague about mandatory barriers, so make Paul involved in.
>
> >
> >> In fact, IMO, the 'DSB' should be used before 'LDR R5, [R1]' in P2 instead
> >> of in P1 because of the belows:
> >
> > A DSB on one CPU does not control the ordering on another CPU. So the
> > DSB on P2 is useless. Note that the P2 code is executed in an
>
> Yes, your are right, sorry for the noise about this part.
>
> > interrupt routine raised as a result of STR R1, [R4]. If P2 had a loop
> > polling for [R4] to get a valid address, we would indeed need a DSB
> > before LDR R5, [R1].
> >
> >> ? ? ? ?- out of order of the two stores in P1 is allowable
> >
> > That's why we have a DSB in there, to ensure that the [R1] location is
> > written before we store the R1 to the [R4] device.
>
> Yes, I see now, I didn't know the interrupt is triggered by this before.
>
> >
> >> ? ? ? ?- P2 should get message address from the mailbox pointed by R4 first,
> >> ? ? ? ?so how can we make sure the 'STR R1, [R4]' in P1 is completed if
> >> ? ? ? ?R4 doesn't point to a Strongly-Ordered memory.
> >
> > Even if the memory is SO, on newer CPUs I don't think it guarantees
> > the completion (pretty much acting like Device memory). But here we
> > don't care when the store to [R4] completes. When this happen, P2 will
> > get an interrupt. P2 does not execute the LDR R5, [R1] in interrupt
> > routine speculatively (i.e. before the interrupt was received).
> >>> Let's assume we have a device that performs the two steps below:
> >>>
> >>> 1. Writes data to RAM
> >>> 2. Updates its status register
> >>>
> >>> A driver running on the CPU has some code as below:
> >>>
> >>> ? ?LDR [Device] ? ?@ read the device status
> >>> ? ?DMB ? ? ? ? ? ? ? @ current barrier that we have in readl
> >>> ? ?TST ? ? ? ? ? ? ? ? @ check whether the DMA transfer is ready
> >>> ? ?BEQ out
> >>> ? ?LDR [Normal] ? ?@ read the DMA buffer
> >>> ? ?...
> >>> out:
> >>>
> >>> With the code above, the CPU may do the following steps:
> >>>
> >>> 1. Issue read from the device. Note that it does not wait for the read
> >>> to complete.
> >>> 2. DMB - ensures that no subsequent memory accesses happen before the
> >>> previous ones.
> >>> 3. Issues read from normal memory speculatively. This is allowed
> >>> because the TST/BEQ are only flow control dependency. In case the
> >>> condition fails, the read is discarded.
> >>> 4. The read from Normal memory (DMA buffer) completes. This could
> >>> happen before the I/O read at point 1 depending on the bus speeds.
> >>> 5. The Device read completes. This can happen after the Normal read
> >>> because of different bus speeds.
> >>> 6. TST clears CPSR.Z
> >>> 7. BEQ not executed.
> >>> 8. Normal read data moved to register.
> >>>
> >>> So, even if the CPU issues the read from Device and Normal memory in
> >>> order (steps 1, 3), they can happen at the device and RAM level out of
> >>> order (steps 4, 5) and the CPU could read data not yet written by the
> >>> device.
> >>>
> >>> The solution is to use a DSB which ensures the completion of the
> >>> Device read before issuing the Normal memory read.
> >>
> >> I agree a DSB is needed in such case, but I am not sure the patch is
> >> good. Maybe we should keep rmb not changed and only replace __iormb
> >> as DSB if this issue only happens between device io memory and normal
> >> memory, then rmb will not degrade performance a little as does by your
> >> patch.
> >
> > What other use do you see for the mandatory barriers if not in
> > relation to I/O? They are not for SMP configurations. We already have
> > a DSB in wmb().
>
> Yes, there are some usage of rmb/wmb not in relation to I/O, and you can find
> the uses in fs, kernel directory. Also rmb/wmb has been used to order coherent
> memory accesses, such as dma descriptor(see tg3, fireware/ohci, musb,
> ehci/ohci, ...)
>
> So your patch may have performance effect on these drivers and others...
>
> >
> > The mandatory barriers could be present in device drivers that use the
> > relaxed I/O accessors and mandatory barriers explicitly as
> > optimisation. This was the approach recommended in past mailing list
> > discussions.
>
> I think the focus of the discussion is on your change to rmb.
>
> --
> Ming Lei
>
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel at lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
next prev parent reply other threads:[~2011-04-28 18:45 UTC|newest]
Thread overview: 12+ messages / expand[flat|nested] mbox.gz Atom feed top
2011-03-25 10:16 [PATCH] ARM: The mandatory barrier rmb() must be a dsb() in for device accesses Catalin Marinas
2011-03-29 15:02 ` Martin Furmanski
2011-03-29 15:21 ` Catalin Marinas
[not found] ` <AANLkTim1=686NAO7ox8s-HAd_KT4OjiUb1Y7jrUtgia3@mail.gmail.com>
[not found] ` <1301416642.583.101.camel@e102109-lin.cambridge.arm.com>
2011-03-29 21:06 ` Martin Furmanski
2011-03-30 8:53 ` Catalin Marinas
2011-04-07 7:07 ` Ming Lei
2011-04-09 8:57 ` Catalin Marinas
2011-04-09 17:03 ` Ming Lei
2011-04-10 10:10 ` Catalin Marinas
2011-04-11 5:43 ` Ming Lei
2011-04-28 18:45 ` Russell King - ARM Linux [this message]
2011-04-28 21:37 ` Catalin Marinas
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20110428184538.GL17290@n2100.arm.linux.org.uk \
--to=linux@arm.linux.org.uk \
--cc=linux-arm-kernel@lists.infradead.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).