From mboxrd@z Thu Jan  1 00:00:00 1970
From: chris@sageembedded.com (Chris Cole)
Date: Tue, 06 Nov 2018 16:12:15 -0800
Subject: [PATCH] mm: improve/fix ARM v7_dma_inv_range() unaligned
 address handling
In-Reply-To: <20181106101437.GY30658@n2100.armlinux.org.uk>
References: <1541444814.537.8.camel@sageembedded.com>
 <20181106101437.GY30658@n2100.armlinux.org.uk>
Message-ID: <1541549535.537.32.camel@sageembedded.com>
To: linux-arm-kernel@lists.infradead.org
List-Id: linux-arm-kernel.lists.infradead.org

On Tue, 2018-11-06 at 10:14 +0000, Russell King - ARM Linux wrote:
> On Mon, Nov 05, 2018 at 11:06:54AM -0800, Chris Cole wrote:
> > This patch addresses possible memory corruption when
> > v7_dma_inv_range(start_address, end_address) address parameters are not
> > aligned to whole cache lines. This function issues "invalidate" cache
> > management operations to all cache lines from start_address (inclusive)
> > to end_address (exclusive). When start_address and/or end_address are
> > not aligned, the start and/or end cache lines are first issued "clean &
> > invalidate" operation. The assumption is this is done to ensure that any
> > dirty data addresses outside the address range (but part of the first or
> > last cache lines) are cleaned/flushed so that data is not lost, which
> > could happen if just an invalidate is issued.
> > 
> > The problem is that these first/last partial cache lines are issued
> > "clean & invalidate" and then "invalidate". This second "invalidate" is
> > not required and worse can cause "lost" writes to addresses outside the
> > address range but part of the cache line. If another component writes to
> > its part of the cache line between the "clean & invalidate" and
> > "invalidate" operations, the write can get lost. This fix is to remove
> > the extra "invalidate" operation when unaligned addressed are used.
> > 
> > A kernel module is available that has a stress test to reproduce the
> > issue and a unit test of the updated v7_dma_inv_range(). It can be
> > downloaded from
> > http://ftp.sageembedded.com/outgoing/linux/cache-test-20181102.tgz.
> > 
> > v7_dma_inv_range() is call by dmac_[un]map_area(addr, len, direction)
> > when the direction is DMA_FROM_DEVICE. One can (I believe) successfully
> > argue that DMA from a device to main memory should use buffers aligned
> > to cache line size, because the "clean & invalidate" might overwrite
> > data that the device just wrote using DMA. But if a driver does use
> > unaligned buffers, at least this fix will prevent memory corruption
> > outside the buffer.
> > 
> > Signed-off-by: chris at sageembedded.com
> 
> I think there's a simpler patch to this:

I took a quick look at the simpler patch you provided and looks
good to me.

Note that in the case of start_address and end_address
that are both unaligned and in the same cache line there will
be two back to back "clean & invalidate" on same cache line.
This will not cause any problems though, so I am okay with it
if you are. (If doing DMA to transfer less then cache line size,
then one propably in not worried about performance impact
of extra "clean & invalidate")

I will run these changes in the kernel test module and
an application that gets glibc heap corruption due to a graphics
driver using unaligned addresses for DMA.

Let me get back to you in a couple of days after testing.

This is my first time submitting a patch to Linux, not sure
what the process is from here.

If testing of your suggesting changes is good, do I provide
a version 2 of the patch with your suggested code?
Or something else?

Thanks for reviewing this patch! (After seeing the volume
of emails on this list, I though it might note get noticed)

Chris


> 
>  arch/arm/mm/cache-v7.S | 8 +++++---
>  1 file changed, 5 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/arm/mm/cache-v7.S b/arch/arm/mm/cache-v7.S
> index 215df435bfb9..2149b47a0c5a 100644
> --- a/arch/arm/mm/cache-v7.S
> +++ b/arch/arm/mm/cache-v7.S
> @@ -360,14 +360,16 @@ ENDPROC(v7_flush_kern_dcache_area)
>  	ALT_UP(W(nop))
>  #endif
>  	mcrne	p15, 0, r0, c7, c14, 1		@ clean & invalidate D / U line
> +	addne	r0, r0, r2			@ next cache line
>  
>  	tst	r1, r3
>  	bic	r1, r1, r3
>  	mcrne	p15, 0, r1, c7, c14, 1		@ clean & invalidate D / U line
> -1:
> -	mcr	p15, 0, r0, c7, c6, 1		@ invalidate D / U line
> -	add	r0, r0, r2
>  	cmp	r0, r1
> +1:
> +	mcrlo	p15, 0, r0, c7, c6, 1		@ invalidate D / U line
> +	addlo	r0, r0, r2
> +	cmplo	r0, r1
>  	blo	1b
>  	dsb	st
>  	ret	lr
>