From: chris@sageembedded.com (Chris Cole)
To: linux-arm-kernel@lists.infradead.org
Subject: [PATCH] mm: improve/fix ARM v7_dma_inv_range() unaligned address handling
Date: Wed, 07 Nov 2018 13:00:59 -0800 [thread overview]
Message-ID: <1541624459.537.42.camel@sageembedded.com> (raw)
In-Reply-To: <1541549535.537.32.camel@sageembedded.com>
On Tue, 2018-11-06 at 16:12 -0800, Chris Cole wrote:
> On Tue, 2018-11-06 at 10:14 +0000, Russell King - ARM Linux wrote:
> > On Mon, Nov 05, 2018 at 11:06:54AM -0800, Chris Cole wrote:
> > > This patch addresses possible memory corruption when
> > > v7_dma_inv_range(start_address, end_address) address parameters are not
> > > aligned to whole cache lines. This function issues "invalidate" cache
> > > management operations to all cache lines from start_address (inclusive)
> > > to end_address (exclusive). When start_address and/or end_address are
> > > not aligned, the start and/or end cache lines are first issued "clean &
> > > invalidate" operation. The assumption is this is done to ensure that any
> > > dirty data addresses outside the address range (but part of the first or
> > > last cache lines) are cleaned/flushed so that data is not lost, which
> > > could happen if just an invalidate is issued.
> > >
> > > The problem is that these first/last partial cache lines are issued
> > > "clean & invalidate" and then "invalidate". This second "invalidate" is
> > > not required and worse can cause "lost" writes to addresses outside the
> > > address range but part of the cache line. If another component writes to
> > > its part of the cache line between the "clean & invalidate" and
> > > "invalidate" operations, the write can get lost. This fix is to remove
> > > the extra "invalidate" operation when unaligned addressed are used.
> > >
> > > A kernel module is available that has a stress test to reproduce the
> > > issue and a unit test of the updated v7_dma_inv_range(). It can be
> > > downloaded from
> > > http://ftp.sageembedded.com/outgoing/linux/cache-test-20181102.tgz.
> > >
> > > v7_dma_inv_range() is call by dmac_[un]map_area(addr, len, direction)
> > > when the direction is DMA_FROM_DEVICE. One can (I believe) successfully
> > > argue that DMA from a device to main memory should use buffers aligned
> > > to cache line size, because the "clean & invalidate" might overwrite
> > > data that the device just wrote using DMA. But if a driver does use
> > > unaligned buffers, at least this fix will prevent memory corruption
> > > outside the buffer.
> > >
> > > Signed-off-by: chris at sageembedded.com
> >
> > I think there's a simpler patch to this:
>
> I took a quick look at the simpler patch you provided and looks
> good to me.
>
> Note that in the case of start_address and end_address
> that are both unaligned and in the same cache line there will
> be two back to back "clean & invalidate" on same cache line.
> This will not cause any problems though, so I am okay with it
> if you are. (If doing DMA to transfer less then cache line size,
> then one propably in not worried about performance impact
> of extra "clean & invalidate")
>
> I will run these changes in the kernel test module and
> an application that gets glibc heap corruption due to a graphics
> driver using unaligned addresses for DMA.
>
> Let me get back to you in a couple of days after testing.
>
> This is my first time submitting a patch to Linux, not sure
> what the process is from here.
>
> If testing of your suggesting changes is good, do I provide
> a version 2 of the patch with your suggested code?
> Or something else?
>
> Thanks for reviewing this patch! (After seeing the volume
> of emails on this list, I though it might note get noticed)
>
> Chris
I have completed testing of your suggested code changes and
varified it fixes the possible memory corruption when unaligned
addresses are used.
I am not sure about the process, but assuming I should send
a version 2 of the patch to linux-arm-kernel with your suggested
code. I will go ahead and do that.
Thanks again,
Chris
>
>
> >
> > arch/arm/mm/cache-v7.S | 8 +++++---
> > 1 file changed, 5 insertions(+), 3 deletions(-)
> >
> > diff --git a/arch/arm/mm/cache-v7.S b/arch/arm/mm/cache-v7.S
> > index 215df435bfb9..2149b47a0c5a 100644
> > --- a/arch/arm/mm/cache-v7.S
> > +++ b/arch/arm/mm/cache-v7.S
> > @@ -360,14 +360,16 @@ ENDPROC(v7_flush_kern_dcache_area)
> > ALT_UP(W(nop))
> > #endif
> > mcrne p15, 0, r0, c7, c14, 1 @ clean & invalidate D / U line
> > + addne r0, r0, r2 @ next cache line
> >
> > tst r1, r3
> > bic r1, r1, r3
> > mcrne p15, 0, r1, c7, c14, 1 @ clean & invalidate D / U line
> > -1:
> > - mcr p15, 0, r0, c7, c6, 1 @ invalidate D / U line
> > - add r0, r0, r2
> > cmp r0, r1
> > +1:
> > + mcrlo p15, 0, r0, c7, c6, 1 @ invalidate D / U line
> > + addlo r0, r0, r2
> > + cmplo r0, r1
> > blo 1b
> > dsb st
> > ret lr
> >
prev parent reply other threads:[~2018-11-07 21:00 UTC|newest]
Thread overview: 4+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-11-05 19:06 [PATCH] mm: improve/fix ARM v7_dma_inv_range() unaligned address handling Chris Cole
2018-11-06 10:14 ` Russell King - ARM Linux
2018-11-07 0:12 ` Chris Cole
2018-11-07 21:00 ` Chris Cole [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1541624459.537.42.camel@sageembedded.com \
--to=chris@sageembedded.com \
--cc=linux-arm-kernel@lists.infradead.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.