* [PATCH] ARM: V6 MPCore v6_dma_inv_range RWFO fix @ 2010-11-23 22:28 Valentine Barshak 2010-11-23 22:42 ` Russell King - ARM Linux 0 siblings, 1 reply; 24+ messages in thread From: Valentine Barshak @ 2010-11-23 22:28 UTC (permalink / raw) To: linux-arm-kernel Cache ownership must be acqired by reading/writing data from the cache line to make cache operation have the desired effect on the SMP MPCore CPU. However, the ownership is never aquired in the v6_dma_inv_range function when cleaning the first line and flushing the last one, in case the address is not aligned to D_CACHE_LINE_SIZE boundary. Fix this by reading/writing data if needed, before performing cache operations. Signed-off-by: Valentine Barshak <vbarshak@mvista.com> Signed-off-by: George G. Davis <gdavis@mvista.com> --- arch/arm/mm/cache-v6.S | 16 ++++++++++++---- 1 files changed, 12 insertions(+), 4 deletions(-) diff --git a/arch/arm/mm/cache-v6.S b/arch/arm/mm/cache-v6.S index 99fa688..e0e5c6b 100644 --- a/arch/arm/mm/cache-v6.S +++ b/arch/arm/mm/cache-v6.S @@ -205,6 +205,10 @@ ENTRY(v6_flush_kern_dcache_area) v6_dma_inv_range: tst r0, #D_CACHE_LINE_SIZE - 1 bic r0, r0, #D_CACHE_LINE_SIZE - 1 +#ifdef CONFIG_DMA_CACHE_RWFO + ldr r2, [r0] @ read for ownership + str r2, [r0] @ write for ownership +#endif #ifdef HARVARD_CACHE mcrne p15, 0, r0, c7, c10, 1 @ clean D line #else @@ -212,16 +216,16 @@ v6_dma_inv_range: #endif tst r1, #D_CACHE_LINE_SIZE - 1 bic r1, r1, #D_CACHE_LINE_SIZE - 1 +#ifdef CONFIG_DMA_CACHE_RWFO + ldrne r2, [r1] @ read for ownership + strne r2, [r1] @ write for ownership +#endif #ifdef HARVARD_CACHE mcrne p15, 0, r1, c7, c14, 1 @ clean & invalidate D line #else mcrne p15, 0, r1, c7, c15, 1 @ clean & invalidate unified line #endif 1: -#ifdef CONFIG_DMA_CACHE_RWFO - ldr r2, [r0] @ read for ownership - str r2, [r0] @ write for ownership -#endif #ifdef HARVARD_CACHE mcr p15, 0, r0, c7, c6, 1 @ invalidate D line #else @@ -229,6 +233,10 @@ v6_dma_inv_range: #endif add r0, r0, #D_CACHE_LINE_SIZE cmp r0, r1 +#ifdef CONFIG_DMA_CACHE_RWFO + ldrlo r2, [r0] @ read for ownership + strlo r2, [r0] @ write for ownership +#endif blo 1b mov r0, #0 mcr p15, 0, r0, c7, c10, 4 @ drain write buffer -- 1.6.0.6 ^ permalink raw reply related [flat|nested] 24+ messages in thread
* [PATCH] ARM: V6 MPCore v6_dma_inv_range RWFO fix 2010-11-23 22:28 [PATCH] ARM: V6 MPCore v6_dma_inv_range RWFO fix Valentine Barshak @ 2010-11-23 22:42 ` Russell King - ARM Linux 2010-11-24 0:24 ` George G. Davis ` (2 more replies) 0 siblings, 3 replies; 24+ messages in thread From: Russell King - ARM Linux @ 2010-11-23 22:42 UTC (permalink / raw) To: linux-arm-kernel On Wed, Nov 24, 2010 at 01:28:06AM +0300, Valentine Barshak wrote: > Cache ownership must be acqired by reading/writing data from the > cache line to make cache operation have the desired effect on the > SMP MPCore CPU. However, the ownership is never aquired in the > v6_dma_inv_range function when cleaning the first line and > flushing the last one, in case the address is not aligned > to D_CACHE_LINE_SIZE boundary. > Fix this by reading/writing data if needed, before performing > cache operations. You should do this on the data _inside_ the requested buffer. We don't know if the overlapping cache line shares itself with some atomic variable, and doing a read-write on it could undo other updates to it. ^ permalink raw reply [flat|nested] 24+ messages in thread
* [PATCH] ARM: V6 MPCore v6_dma_inv_range RWFO fix 2010-11-23 22:42 ` Russell King - ARM Linux @ 2010-11-24 0:24 ` George G. Davis 2010-11-24 10:32 ` Catalin Marinas 2010-11-24 10:42 ` Catalin Marinas 2010-11-24 17:40 ` [PATCH] ARM: V6 MPCore v6_dma_inv_range " Valentine Barshak 2 siblings, 1 reply; 24+ messages in thread From: George G. Davis @ 2010-11-24 0:24 UTC (permalink / raw) To: linux-arm-kernel Hi, On Tue, Nov 23, 2010 at 10:42:37PM +0000, Russell King - ARM Linux wrote: > On Wed, Nov 24, 2010 at 01:28:06AM +0300, Valentine Barshak wrote: > > Cache ownership must be acqired by reading/writing data from the > > cache line to make cache operation have the desired effect on the > > SMP MPCore CPU. However, the ownership is never aquired in the > > v6_dma_inv_range function when cleaning the first line and > > flushing the last one, in case the address is not aligned > > to D_CACHE_LINE_SIZE boundary. > > Fix this by reading/writing data if needed, before performing > > cache operations. > > You should do this on the data _inside_ the requested buffer. We don't > know if the overlapping cache line shares itself with some atomic > variable, and doing a read-write on it could undo other updates to it. OK, how about this (untested interdiff): diff --git a/arch/arm/mm/cache-v6.S b/arch/arm/mm/cache-v6.S index e0e5c6b..e778f2a 100644 --- a/arch/arm/mm/cache-v6.S +++ b/arch/arm/mm/cache-v6.S @@ -206,8 +206,8 @@ v6_dma_inv_range: tst r0, #D_CACHE_LINE_SIZE - 1 bic r0, r0, #D_CACHE_LINE_SIZE - 1 #ifdef CONFIG_DMA_CACHE_RWFO - ldr r2, [r0] @ read for ownership - str r2, [r0] @ write for ownership + ldr r2, [r0, #D_CACHE_LINE_SIZE - 4] @ read for ownership + str r2, [r0, #D_CACHE_LINE_SIZE - 4] @ write for ownership #endif #ifdef HARVARD_CACHE mcrne p15, 0, r0, c7, c10, 1 @ clean D line I believe the above addresses "do this on the data _inside_ the requested buffer" and does no more harm to other data in shared cache lines than the preexisting code but still does not address issues of overlapping shared data within a single word but that seems rather asking for trouble in the first place. Thanks! -- Regards, George ^ permalink raw reply related [flat|nested] 24+ messages in thread
* [PATCH] ARM: V6 MPCore v6_dma_inv_range RWFO fix 2010-11-24 0:24 ` George G. Davis @ 2010-11-24 10:32 ` Catalin Marinas 2010-11-24 15:14 ` George G. Davis 0 siblings, 1 reply; 24+ messages in thread From: Catalin Marinas @ 2010-11-24 10:32 UTC (permalink / raw) To: linux-arm-kernel On 24 November 2010 00:24, George G. Davis <gdavis@mvista.com> wrote: > On Tue, Nov 23, 2010 at 10:42:37PM +0000, Russell King - ARM Linux wrote: >> On Wed, Nov 24, 2010 at 01:28:06AM +0300, Valentine Barshak wrote: >> > Cache ownership must be acqired by reading/writing data from the >> > cache line to make cache operation have the desired effect on the >> > SMP MPCore CPU. However, the ownership is never aquired in the >> > v6_dma_inv_range function when cleaning the first line and >> > flushing the last one, in case the address is not aligned >> > to D_CACHE_LINE_SIZE boundary. >> > Fix this by reading/writing data if needed, before performing >> > cache operations. >> >> You should do this on the data _inside_ the requested buffer. ?We don't >> know if the overlapping cache line shares itself with some atomic >> variable, and doing a read-write on it could undo other updates to it. > > OK, how about this (untested interdiff): > > diff --git a/arch/arm/mm/cache-v6.S b/arch/arm/mm/cache-v6.S > index e0e5c6b..e778f2a 100644 > --- a/arch/arm/mm/cache-v6.S > +++ b/arch/arm/mm/cache-v6.S > @@ -206,8 +206,8 @@ v6_dma_inv_range: > ? ? ? ?tst ? ? r0, #D_CACHE_LINE_SIZE - 1 > ? ? ? ?bic ? ? r0, r0, #D_CACHE_LINE_SIZE - 1 > ?#ifdef CONFIG_DMA_CACHE_RWFO > - ? ? ? ldr ? ? r2, [r0] ? ? ? ? ? ? ? ? ? ? ? ?@ read for ownership > - ? ? ? str ? ? r2, [r0] ? ? ? ? ? ? ? ? ? ? ? ?@ write for ownership > + ? ? ? ldr ? ? r2, [r0, #D_CACHE_LINE_SIZE - 4] ? ? ? ?@ read for ownership > + ? ? ? str ? ? r2, [r0, #D_CACHE_LINE_SIZE - 4] ? ? ? ?@ write for ownership > ?#endif r0 here is already aligned to a cache line, so you still read from the same cache line, I don't see any difference. -- Catalin ^ permalink raw reply [flat|nested] 24+ messages in thread
* [PATCH] ARM: V6 MPCore v6_dma_inv_range RWFO fix 2010-11-24 10:32 ` Catalin Marinas @ 2010-11-24 15:14 ` George G. Davis 0 siblings, 0 replies; 24+ messages in thread From: George G. Davis @ 2010-11-24 15:14 UTC (permalink / raw) To: linux-arm-kernel On Wed, Nov 24, 2010 at 10:32:08AM +0000, Catalin Marinas wrote: > On 24 November 2010 00:24, George G. Davis <gdavis@mvista.com> wrote: > > On Tue, Nov 23, 2010 at 10:42:37PM +0000, Russell King - ARM Linux wrote: > >> On Wed, Nov 24, 2010 at 01:28:06AM +0300, Valentine Barshak wrote: > >> > Cache ownership must be acqired by reading/writing data from the > >> > cache line to make cache operation have the desired effect on the > >> > SMP MPCore CPU. However, the ownership is never aquired in the > >> > v6_dma_inv_range function when cleaning the first line and > >> > flushing the last one, in case the address is not aligned > >> > to D_CACHE_LINE_SIZE boundary. > >> > Fix this by reading/writing data if needed, before performing > >> > cache operations. > >> > >> You should do this on the data _inside_ the requested buffer. ?We don't > >> know if the overlapping cache line shares itself with some atomic > >> variable, and doing a read-write on it could undo other updates to it. > > > > OK, how about this (untested interdiff): > > > > diff --git a/arch/arm/mm/cache-v6.S b/arch/arm/mm/cache-v6.S > > index e0e5c6b..e778f2a 100644 > > --- a/arch/arm/mm/cache-v6.S > > +++ b/arch/arm/mm/cache-v6.S > > @@ -206,8 +206,8 @@ v6_dma_inv_range: > > ? ? ? ?tst ? ? r0, #D_CACHE_LINE_SIZE - 1 > > ? ? ? ?bic ? ? r0, r0, #D_CACHE_LINE_SIZE - 1 > > ?#ifdef CONFIG_DMA_CACHE_RWFO > > - ? ? ? ldr ? ? r2, [r0] ? ? ? ? ? ? ? ? ? ? ? ?@ read for ownership > > - ? ? ? str ? ? r2, [r0] ? ? ? ? ? ? ? ? ? ? ? ?@ write for ownership > > + ? ? ? ldr ? ? r2, [r0, #D_CACHE_LINE_SIZE - 4] ? ? ? ?@ read for ownership > > + ? ? ? str ? ? r2, [r0, #D_CACHE_LINE_SIZE - 4] ? ? ? ?@ write for ownership > > ?#endif > > r0 here is already aligned to a cache line, so you still read from the > same cache line, I don't see any difference. Assuming "start" was not cache line aligned to begin with but let's say it was initially at cache offset 24, then the above meets the contraint that it is inside the buffer and doesn't touch any data which was outside the DMA buffer. I also had a paranoia that start and end may not even be word aligned which is why these RWFOs are placed after those BICs. Thanks! -- Regards, George > > -- > Catalin ^ permalink raw reply [flat|nested] 24+ messages in thread
* [PATCH] ARM: V6 MPCore v6_dma_inv_range RWFO fix 2010-11-23 22:42 ` Russell King - ARM Linux 2010-11-24 0:24 ` George G. Davis @ 2010-11-24 10:42 ` Catalin Marinas 2010-11-24 15:10 ` George G. Davis 2010-11-24 17:33 ` Russell King - ARM Linux 2010-11-24 17:40 ` [PATCH] ARM: V6 MPCore v6_dma_inv_range " Valentine Barshak 2 siblings, 2 replies; 24+ messages in thread From: Catalin Marinas @ 2010-11-24 10:42 UTC (permalink / raw) To: linux-arm-kernel On Tue, 2010-11-23 at 22:42 +0000, Russell King - ARM Linux wrote: > On Wed, Nov 24, 2010 at 01:28:06AM +0300, Valentine Barshak wrote: > > Cache ownership must be acqired by reading/writing data from the > > cache line to make cache operation have the desired effect on the > > SMP MPCore CPU. However, the ownership is never aquired in the > > v6_dma_inv_range function when cleaning the first line and > > flushing the last one, in case the address is not aligned > > to D_CACHE_LINE_SIZE boundary. > > Fix this by reading/writing data if needed, before performing > > cache operations. > > You should do this on the data _inside_ the requested buffer. We don't > know if the overlapping cache line shares itself with some atomic > variable, and doing a read-write on it could undo other updates to it. We could just use the boundary addresses to avoid writing beyond the buffer. Something like below (pretty much moving the BIC after the RFO, untested): diff --git a/arch/arm/mm/cache-v6.S b/arch/arm/mm/cache-v6.S index 99fa688..d63bb55 100644 --- a/arch/arm/mm/cache-v6.S +++ b/arch/arm/mm/cache-v6.S @@ -204,6 +204,10 @@ ENTRY(v6_flush_kern_dcache_area) */ v6_dma_inv_range: tst r0, #D_CACHE_LINE_SIZE - 1 +#ifdef CONFIG_DMA_CACHE_RWFO + ldrne r2, [r0] @ read for ownership + strne r2, [r0] @ write for ownership +#endif bic r0, r0, #D_CACHE_LINE_SIZE - 1 #ifdef HARVARD_CACHE mcrne p15, 0, r0, c7, c10, 1 @ clean D line @@ -211,6 +215,10 @@ v6_dma_inv_range: mcrne p15, 0, r0, c7, c11, 1 @ clean unified line #endif tst r1, #D_CACHE_LINE_SIZE - 1 +#ifdef CONFIG_DMA_CACHE_RWFO + ldrne r2, [r1, #-4] @ read for ownership + strne r2, [r1, #-4] @ write for ownership +#endif bic r1, r1, #D_CACHE_LINE_SIZE - 1 #ifdef HARVARD_CACHE mcrne p15, 0, r1, c7, c14, 1 @ clean & invalidate D line -- Catalin ^ permalink raw reply related [flat|nested] 24+ messages in thread
* [PATCH] ARM: V6 MPCore v6_dma_inv_range RWFO fix 2010-11-24 10:42 ` Catalin Marinas @ 2010-11-24 15:10 ` George G. Davis 2010-11-24 15:35 ` Catalin Marinas 2010-11-24 17:33 ` Russell King - ARM Linux 1 sibling, 1 reply; 24+ messages in thread From: George G. Davis @ 2010-11-24 15:10 UTC (permalink / raw) To: linux-arm-kernel On Wed, Nov 24, 2010 at 10:42:13AM +0000, Catalin Marinas wrote: > On Tue, 2010-11-23 at 22:42 +0000, Russell King - ARM Linux wrote: > > On Wed, Nov 24, 2010 at 01:28:06AM +0300, Valentine Barshak wrote: > > > Cache ownership must be acqired by reading/writing data from the > > > cache line to make cache operation have the desired effect on the > > > SMP MPCore CPU. However, the ownership is never aquired in the > > > v6_dma_inv_range function when cleaning the first line and > > > flushing the last one, in case the address is not aligned > > > to D_CACHE_LINE_SIZE boundary. > > > Fix this by reading/writing data if needed, before performing > > > cache operations. > > > > You should do this on the data _inside_ the requested buffer. We don't > > know if the overlapping cache line shares itself with some atomic > > variable, and doing a read-write on it could undo other updates to it. > > We could just use the boundary addresses to avoid writing beyond the > buffer. Something like below (pretty much moving the BIC after the RFO, > untested): > > > diff --git a/arch/arm/mm/cache-v6.S b/arch/arm/mm/cache-v6.S > index 99fa688..d63bb55 100644 > --- a/arch/arm/mm/cache-v6.S > +++ b/arch/arm/mm/cache-v6.S > @@ -204,6 +204,10 @@ ENTRY(v6_flush_kern_dcache_area) > */ > v6_dma_inv_range: > tst r0, #D_CACHE_LINE_SIZE - 1 > +#ifdef CONFIG_DMA_CACHE_RWFO > + ldrne r2, [r0] @ read for ownership > + strne r2, [r0] @ write for ownership My concern doing RWFO prior to aligning "start" is that it could be non-word-aligned which may result in an alignment fault. > +#endif > bic r0, r0, #D_CACHE_LINE_SIZE - 1 > #ifdef HARVARD_CACHE > mcrne p15, 0, r0, c7, c10, 1 @ clean D line > @@ -211,6 +215,10 @@ v6_dma_inv_range: > mcrne p15, 0, r0, c7, c11, 1 @ clean unified line > #endif > tst r1, #D_CACHE_LINE_SIZE - 1 > +#ifdef CONFIG_DMA_CACHE_RWFO > + ldrne r2, [r1, #-4] @ read for ownership > + strne r2, [r1, #-4] @ write for ownership Same concern here with "end". Are start and end reasonably expected to be word aligned? Thanks! -- Regards, George > +#endif > bic r1, r1, #D_CACHE_LINE_SIZE - 1 > #ifdef HARVARD_CACHE > mcrne p15, 0, r1, c7, c14, 1 @ clean & invalidate D line > > > > -- > Catalin > > ^ permalink raw reply [flat|nested] 24+ messages in thread
* [PATCH] ARM: V6 MPCore v6_dma_inv_range RWFO fix 2010-11-24 15:10 ` George G. Davis @ 2010-11-24 15:35 ` Catalin Marinas 0 siblings, 0 replies; 24+ messages in thread From: Catalin Marinas @ 2010-11-24 15:35 UTC (permalink / raw) To: linux-arm-kernel On 24 November 2010 15:10, George G. Davis <gdavis@mvista.com> wrote: > On Wed, Nov 24, 2010 at 10:42:13AM +0000, Catalin Marinas wrote: >> On Tue, 2010-11-23 at 22:42 +0000, Russell King - ARM Linux wrote: >> > On Wed, Nov 24, 2010 at 01:28:06AM +0300, Valentine Barshak wrote: >> > > Cache ownership must be acqired by reading/writing data from the >> > > cache line to make cache operation have the desired effect on the >> > > SMP MPCore CPU. However, the ownership is never aquired in the >> > > v6_dma_inv_range function when cleaning the first line and >> > > flushing the last one, in case the address is not aligned >> > > to D_CACHE_LINE_SIZE boundary. >> > > Fix this by reading/writing data if needed, before performing >> > > cache operations. >> > >> > You should do this on the data _inside_ the requested buffer. ?We don't >> > know if the overlapping cache line shares itself with some atomic >> > variable, and doing a read-write on it could undo other updates to it. >> >> We could just use the boundary addresses to avoid writing beyond the >> buffer. Something like below (pretty much moving the BIC after the RFO, >> untested): >> >> >> diff --git a/arch/arm/mm/cache-v6.S b/arch/arm/mm/cache-v6.S >> index 99fa688..d63bb55 100644 >> --- a/arch/arm/mm/cache-v6.S >> +++ b/arch/arm/mm/cache-v6.S >> @@ -204,6 +204,10 @@ ENTRY(v6_flush_kern_dcache_area) >> ? */ >> ?v6_dma_inv_range: >> ? ? ? tst ? ? r0, #D_CACHE_LINE_SIZE - 1 >> +#ifdef CONFIG_DMA_CACHE_RWFO >> + ? ? ldrne ? r2, [r0] ? ? ? ? ? ? ? ? ? ? ? ?@ read for ownership >> + ? ? strne ? r2, [r0] ? ? ? ? ? ? ? ? ? ? ? ?@ write for ownership > > My concern doing RWFO prior to aligning "start" is that it could be > non-word-aligned which may result in an alignment fault. True. Maybe you can use ldrbne/strbne. -- Catalin ^ permalink raw reply [flat|nested] 24+ messages in thread
* [PATCH] ARM: V6 MPCore v6_dma_inv_range RWFO fix 2010-11-24 10:42 ` Catalin Marinas 2010-11-24 15:10 ` George G. Davis @ 2010-11-24 17:33 ` Russell King - ARM Linux 2010-11-24 17:46 ` Catalin Marinas 1 sibling, 1 reply; 24+ messages in thread From: Russell King - ARM Linux @ 2010-11-24 17:33 UTC (permalink / raw) To: linux-arm-kernel On Wed, Nov 24, 2010 at 10:42:13AM +0000, Catalin Marinas wrote: > On Tue, 2010-11-23 at 22:42 +0000, Russell King - ARM Linux wrote: > > On Wed, Nov 24, 2010 at 01:28:06AM +0300, Valentine Barshak wrote: > > > Cache ownership must be acqired by reading/writing data from the > > > cache line to make cache operation have the desired effect on the > > > SMP MPCore CPU. However, the ownership is never aquired in the > > > v6_dma_inv_range function when cleaning the first line and > > > flushing the last one, in case the address is not aligned > > > to D_CACHE_LINE_SIZE boundary. > > > Fix this by reading/writing data if needed, before performing > > > cache operations. > > > > You should do this on the data _inside_ the requested buffer. We don't > > know if the overlapping cache line shares itself with some atomic > > variable, and doing a read-write on it could undo other updates to it. > > We could just use the boundary addresses to avoid writing beyond the > buffer. Something like below (pretty much moving the BIC after the RFO, > untested): What if the pointer is not word aligned? The safest thing to do is: tst r0, #D_CACHE_LINE_SIZE - 1 bic r0, r0, #D_CACHE_LINE_SIZE - 1 ldrneb r2, [r0, #D_CACHE_LINE_SIZE - 1] strneb r2, [r0, #D_CACHE_LINE_SIZE - 1] ... tst r1, #D_CACHE_LINE_SIZE - 1 bic r1, r1, #D_CACHE_LINE_SIZE - 1 ldrneb r2, [r0] strneb r2, [r0] so that we only touch the very last byte of the first cache line, and the very first byte of the last cache line. That provides for the most safe behaviour that we can manage in all scenarios. ^ permalink raw reply [flat|nested] 24+ messages in thread
* [PATCH] ARM: V6 MPCore v6_dma_inv_range RWFO fix 2010-11-24 17:33 ` Russell King - ARM Linux @ 2010-11-24 17:46 ` Catalin Marinas 2010-11-24 18:01 ` Russell King - ARM Linux 0 siblings, 1 reply; 24+ messages in thread From: Catalin Marinas @ 2010-11-24 17:46 UTC (permalink / raw) To: linux-arm-kernel On Wed, 2010-11-24 at 17:33 +0000, Russell King - ARM Linux wrote: > On Wed, Nov 24, 2010 at 10:42:13AM +0000, Catalin Marinas wrote: > > On Tue, 2010-11-23 at 22:42 +0000, Russell King - ARM Linux wrote: > > > On Wed, Nov 24, 2010 at 01:28:06AM +0300, Valentine Barshak wrote: > > > > Cache ownership must be acqired by reading/writing data from the > > > > cache line to make cache operation have the desired effect on the > > > > SMP MPCore CPU. However, the ownership is never aquired in the > > > > v6_dma_inv_range function when cleaning the first line and > > > > flushing the last one, in case the address is not aligned > > > > to D_CACHE_LINE_SIZE boundary. > > > > Fix this by reading/writing data if needed, before performing > > > > cache operations. > > > > > > You should do this on the data _inside_ the requested buffer. We don't > > > know if the overlapping cache line shares itself with some atomic > > > variable, and doing a read-write on it could undo other updates to it. > > > > We could just use the boundary addresses to avoid writing beyond the > > buffer. Something like below (pretty much moving the BIC after the RFO, > > untested): > > What if the pointer is not word aligned? I followed up in my reply to George. > The safest thing to do is: > > tst r0, #D_CACHE_LINE_SIZE - 1 > bic r0, r0, #D_CACHE_LINE_SIZE - 1 > ldrneb r2, [r0, #D_CACHE_LINE_SIZE - 1] > strneb r2, [r0, #D_CACHE_LINE_SIZE - 1] Should we always assume that the buffer size is at least a cache line? I think the current code assumes this already. > ... > tst r1, #D_CACHE_LINE_SIZE - 1 > bic r1, r1, #D_CACHE_LINE_SIZE - 1 > ldrneb r2, [r0] > strneb r2, [r0] Did you mean "ldrneb r2, [r1]"? Since the interval is exclusive, what if r1 is already cache-line aligned? -- Catalin ^ permalink raw reply [flat|nested] 24+ messages in thread
* [PATCH] ARM: V6 MPCore v6_dma_inv_range RWFO fix 2010-11-24 17:46 ` Catalin Marinas @ 2010-11-24 18:01 ` Russell King - ARM Linux 2010-11-24 18:07 ` Catalin Marinas 0 siblings, 1 reply; 24+ messages in thread From: Russell King - ARM Linux @ 2010-11-24 18:01 UTC (permalink / raw) To: linux-arm-kernel On Wed, Nov 24, 2010 at 05:46:03PM +0000, Catalin Marinas wrote: > On Wed, 2010-11-24 at 17:33 +0000, Russell King - ARM Linux wrote: > > What if the pointer is not word aligned? > > I followed up in my reply to George. > > > The safest thing to do is: > > > > tst r0, #D_CACHE_LINE_SIZE - 1 > > bic r0, r0, #D_CACHE_LINE_SIZE - 1 > > ldrneb r2, [r0, #D_CACHE_LINE_SIZE - 1] > > strneb r2, [r0, #D_CACHE_LINE_SIZE - 1] > > Should we always assume that the buffer size is at least a cache line? I > think the current code assumes this already. Hmm, true. In that case: tst r0, #D_CACHE_LINE_SIZE - 1 ldrneb r2, [r0] strneb r2, [r0] bic r0, r0, #D_CACHE_LINE_SIZE - 1 > > tst r1, #D_CACHE_LINE_SIZE - 1 > > bic r1, r1, #D_CACHE_LINE_SIZE - 1 > > ldrneb r2, [r0] > > strneb r2, [r0] > > Did you mean "ldrneb r2, [r1]"? Yes. However, for the same reason above: tst r1, #D_CACHE_LINE_SIZE - 1 ldrneb r2, [r1, #-1] strneb r2, [r1, #-1] bic r1, r1, #D_CACHE_LINE_SIZE - 1 > Since the interval is exclusive, what if r1 is already cache-line > aligned? If r1 is cache line aligned, then we won't be touching that cache line except for the main loop, which should already do what is necessary. ^ permalink raw reply [flat|nested] 24+ messages in thread
* [PATCH] ARM: V6 MPCore v6_dma_inv_range RWFO fix 2010-11-24 18:01 ` Russell King - ARM Linux @ 2010-11-24 18:07 ` Catalin Marinas 2010-11-24 18:39 ` [PATCH] ARM: V6 MPCore v6_dma_inv_range and v6_dma_flush_range " Valentine Barshak 2010-11-29 15:28 ` Valentine Barshak 0 siblings, 2 replies; 24+ messages in thread From: Catalin Marinas @ 2010-11-24 18:07 UTC (permalink / raw) To: linux-arm-kernel On Wed, 2010-11-24 at 18:01 +0000, Russell King - ARM Linux wrote: > On Wed, Nov 24, 2010 at 05:46:03PM +0000, Catalin Marinas wrote: > > On Wed, 2010-11-24 at 17:33 +0000, Russell King - ARM Linux wrote: > > > What if the pointer is not word aligned? > > > > I followed up in my reply to George. > > > > > The safest thing to do is: > > > > > > tst r0, #D_CACHE_LINE_SIZE - 1 > > > bic r0, r0, #D_CACHE_LINE_SIZE - 1 > > > ldrneb r2, [r0, #D_CACHE_LINE_SIZE - 1] > > > strneb r2, [r0, #D_CACHE_LINE_SIZE - 1] > > > > Should we always assume that the buffer size is at least a cache line? I > > think the current code assumes this already. > > Hmm, true. In that case: > > tst r0, #D_CACHE_LINE_SIZE - 1 > ldrneb r2, [r0] > strneb r2, [r0] > bic r0, r0, #D_CACHE_LINE_SIZE - 1 > > > > > tst r1, #D_CACHE_LINE_SIZE - 1 > > > bic r1, r1, #D_CACHE_LINE_SIZE - 1 > > > ldrneb r2, [r0] > > > strneb r2, [r0] > > > > Did you mean "ldrneb r2, [r1]"? > > Yes. However, for the same reason above: > > tst r1, #D_CACHE_LINE_SIZE - 1 > ldrneb r2, [r1, #-1] > strneb r2, [r1, #-1] > bic r1, r1, #D_CACHE_LINE_SIZE - 1 This looks OK to me (though not sure about Thumb-2, I haven't checked). -- Catalin ^ permalink raw reply [flat|nested] 24+ messages in thread
* [PATCH] ARM: V6 MPCore v6_dma_inv_range and v6_dma_flush_range RWFO fix 2010-11-24 18:07 ` Catalin Marinas @ 2010-11-24 18:39 ` Valentine Barshak 2010-12-08 10:25 ` Valentine Barshak ` (2 more replies) 2010-11-29 15:28 ` Valentine Barshak 1 sibling, 3 replies; 24+ messages in thread From: Valentine Barshak @ 2010-11-24 18:39 UTC (permalink / raw) To: linux-arm-kernel Updated according to the comments to avoid r/w outside the buffer and used byte r/w for the possible unaligned data. Seems to work fine. Cache ownership must be acqired by reading/writing data from the cache line to make cache operation have the desired effect on the SMP MPCore CPU. However, the ownership is never aquired in the v6_dma_inv_range function when cleaning the first line and flushing the last one, in case the address is not aligned to D_CACHE_LINE_SIZE boundary. Fix this by reading/writing data if needed, before performing cache operations. While at it, fix v6_dma_flush_range to prevent RWFO outside the buffer. Signed-off-by: Valentine Barshak <vbarshak@mvista.com> Signed-off-by: George G. Davis <gdavis@mvista.com> --- arch/arm/mm/cache-v6.S | 40 ++++++++++++++++++++++++++-------------- 1 files changed, 26 insertions(+), 14 deletions(-) diff --git a/arch/arm/mm/cache-v6.S b/arch/arm/mm/cache-v6.S index 99fa688..622f8a1 100644 --- a/arch/arm/mm/cache-v6.S +++ b/arch/arm/mm/cache-v6.S @@ -203,25 +203,29 @@ ENTRY(v6_flush_kern_dcache_area) * - end - virtual end address of region */ v6_dma_inv_range: - tst r0, #D_CACHE_LINE_SIZE - 1 - bic r0, r0, #D_CACHE_LINE_SIZE - 1 -#ifdef HARVARD_CACHE - mcrne p15, 0, r0, c7, c10, 1 @ clean D line -#else - mcrne p15, 0, r0, c7, c11, 1 @ clean unified line -#endif tst r1, #D_CACHE_LINE_SIZE - 1 +#ifdef CONFIG_DMA_CACHE_RWFO + ldrneb r2, [r1, #-1] @ read for ownership + strneb r2, [r1, #-1] @ write for ownership +#endif bic r1, r1, #D_CACHE_LINE_SIZE - 1 #ifdef HARVARD_CACHE mcrne p15, 0, r1, c7, c14, 1 @ clean & invalidate D line #else mcrne p15, 0, r1, c7, c15, 1 @ clean & invalidate unified line #endif -1: #ifdef CONFIG_DMA_CACHE_RWFO - ldr r2, [r0] @ read for ownership - str r2, [r0] @ write for ownership + ldrb r2, [r0] @ read for ownership + strb r2, [r0] @ write for ownership +#endif + tst r0, #D_CACHE_LINE_SIZE - 1 + bic r0, r0, #D_CACHE_LINE_SIZE - 1 +#ifdef HARVARD_CACHE + mcrne p15, 0, r0, c7, c10, 1 @ clean D line +#else + mcrne p15, 0, r0, c7, c11, 1 @ clean unified line #endif +1: #ifdef HARVARD_CACHE mcr p15, 0, r0, c7, c6, 1 @ invalidate D line #else @@ -229,6 +233,10 @@ v6_dma_inv_range: #endif add r0, r0, #D_CACHE_LINE_SIZE cmp r0, r1 +#ifdef CONFIG_DMA_CACHE_RWFO + ldrlo r2, [r0] @ read for ownership + strlo r2, [r0] @ write for ownership +#endif blo 1b mov r0, #0 mcr p15, 0, r0, c7, c10, 4 @ drain write buffer @@ -263,12 +271,12 @@ v6_dma_clean_range: * - end - virtual end address of region */ ENTRY(v6_dma_flush_range) - bic r0, r0, #D_CACHE_LINE_SIZE - 1 -1: #ifdef CONFIG_DMA_CACHE_RWFO - ldr r2, [r0] @ read for ownership - str r2, [r0] @ write for ownership + ldrb r2, [r0] @ read for ownership + strb r2, [r0] @ write for ownership #endif + bic r0, r0, #D_CACHE_LINE_SIZE - 1 +1: #ifdef HARVARD_CACHE mcr p15, 0, r0, c7, c14, 1 @ clean & invalidate D line #else @@ -276,6 +284,10 @@ ENTRY(v6_dma_flush_range) #endif add r0, r0, #D_CACHE_LINE_SIZE cmp r0, r1 +#ifdef CONFIG_DMA_CACHE_RWFO + ldrlob r2, [r0] @ read for ownership + strlob r2, [r0] @ write for ownership +#endif blo 1b mov r0, #0 mcr p15, 0, r0, c7, c10, 4 @ drain write buffer -- 1.6.0.6 ^ permalink raw reply related [flat|nested] 24+ messages in thread
* [PATCH] ARM: V6 MPCore v6_dma_inv_range and v6_dma_flush_range RWFO fix 2010-11-24 18:39 ` [PATCH] ARM: V6 MPCore v6_dma_inv_range and v6_dma_flush_range " Valentine Barshak @ 2010-12-08 10:25 ` Valentine Barshak 2010-12-09 16:04 ` Catalin Marinas 2010-12-10 5:26 ` George G. Davis 2 siblings, 0 replies; 24+ messages in thread From: Valentine Barshak @ 2010-12-08 10:25 UTC (permalink / raw) To: linux-arm-kernel Valentine Barshak wrote: > Updated according to the comments to avoid r/w outside the buffer and > used byte r/w for the possible unaligned data. Seems to work fine. > > Any comments on this one? Thanks, Val. > Cache ownership must be acqired by reading/writing data from the > cache line to make cache operation have the desired effect on the > SMP MPCore CPU. However, the ownership is never aquired in the > v6_dma_inv_range function when cleaning the first line and > flushing the last one, in case the address is not aligned > to D_CACHE_LINE_SIZE boundary. > Fix this by reading/writing data if needed, before performing > cache operations. > While at it, fix v6_dma_flush_range to prevent RWFO outside > the buffer. > > Signed-off-by: Valentine Barshak <vbarshak@mvista.com> > Signed-off-by: George G. Davis <gdavis@mvista.com> ^ permalink raw reply [flat|nested] 24+ messages in thread
* [PATCH] ARM: V6 MPCore v6_dma_inv_range and v6_dma_flush_range RWFO fix 2010-11-24 18:39 ` [PATCH] ARM: V6 MPCore v6_dma_inv_range and v6_dma_flush_range " Valentine Barshak 2010-12-08 10:25 ` Valentine Barshak @ 2010-12-09 16:04 ` Catalin Marinas 2010-12-09 17:04 ` Valentine Barshak 2010-12-09 22:07 ` Stephen Boyd 2010-12-10 5:26 ` George G. Davis 2 siblings, 2 replies; 24+ messages in thread From: Catalin Marinas @ 2010-12-09 16:04 UTC (permalink / raw) To: linux-arm-kernel On 24 November 2010 18:39, Valentine Barshak <vbarshak@mvista.com> wrote: > Updated according to the comments to avoid r/w outside the buffer and > used byte r/w for the possible unaligned data. Seems to work fine. > > Cache ownership must be acqired by reading/writing data from the > cache line to make cache operation have the desired effect on the > SMP MPCore CPU. However, the ownership is never aquired in the > v6_dma_inv_range function when cleaning the first line and > flushing the last one, in case the address is not aligned > to D_CACHE_LINE_SIZE boundary. > Fix this by reading/writing data if needed, before performing > cache operations. > While at it, fix v6_dma_flush_range to prevent RWFO outside > the buffer. > > Signed-off-by: Valentine Barshak <vbarshak@mvista.com> > Signed-off-by: George G. Davis <gdavis@mvista.com> I eventually found a bit of time to look at this. The patch looks fine to me: Acked-by: Catalin Marinas <catalin.marinas@arm.com> ^ permalink raw reply [flat|nested] 24+ messages in thread
* [PATCH] ARM: V6 MPCore v6_dma_inv_range and v6_dma_flush_range RWFO fix 2010-12-09 16:04 ` Catalin Marinas @ 2010-12-09 17:04 ` Valentine Barshak 2010-12-09 22:07 ` Stephen Boyd 1 sibling, 0 replies; 24+ messages in thread From: Valentine Barshak @ 2010-12-09 17:04 UTC (permalink / raw) To: linux-arm-kernel Catalin Marinas wrote: > On 24 November 2010 18:39, Valentine Barshak <vbarshak@mvista.com> wrote: > >> Updated according to the comments to avoid r/w outside the buffer and >> used byte r/w for the possible unaligned data. Seems to work fine. >> >> Cache ownership must be acqired by reading/writing data from the >> cache line to make cache operation have the desired effect on the >> SMP MPCore CPU. However, the ownership is never aquired in the >> v6_dma_inv_range function when cleaning the first line and >> flushing the last one, in case the address is not aligned >> to D_CACHE_LINE_SIZE boundary. >> Fix this by reading/writing data if needed, before performing >> cache operations. >> While at it, fix v6_dma_flush_range to prevent RWFO outside >> the buffer. >> >> Signed-off-by: Valentine Barshak <vbarshak@mvista.com> >> Signed-off-by: George G. Davis <gdavis@mvista.com> >> > > I eventually found a bit of time to look at this. The patch looks fine to me: > > Acked-by: Catalin Marinas <catalin.marinas@arm.com> > Thanks! Regards, Val. ^ permalink raw reply [flat|nested] 24+ messages in thread
* [PATCH] ARM: V6 MPCore v6_dma_inv_range and v6_dma_flush_range RWFO fix 2010-12-09 16:04 ` Catalin Marinas 2010-12-09 17:04 ` Valentine Barshak @ 2010-12-09 22:07 ` Stephen Boyd 2010-12-10 5:44 ` George G. Davis 1 sibling, 1 reply; 24+ messages in thread From: Stephen Boyd @ 2010-12-09 22:07 UTC (permalink / raw) To: linux-arm-kernel On 12/09/2010 08:04 AM, Catalin Marinas wrote: > On 24 November 2010 18:39, Valentine Barshak <vbarshak@mvista.com> wrote: >> Updated according to the comments to avoid r/w outside the buffer and >> used byte r/w for the possible unaligned data. Seems to work fine. >> >> Cache ownership must be acqired by reading/writing data from the >> cache line to make cache operation have the desired effect on the >> SMP MPCore CPU. However, the ownership is never aquired in the >> v6_dma_inv_range function when cleaning the first line and >> flushing the last one, in case the address is not aligned >> to D_CACHE_LINE_SIZE boundary. >> Fix this by reading/writing data if needed, before performing >> cache operations. >> While at it, fix v6_dma_flush_range to prevent RWFO outside >> the buffer. >> >> Signed-off-by: Valentine Barshak <vbarshak@mvista.com> >> Signed-off-by: George G. Davis <gdavis@mvista.com> > I eventually found a bit of time to look at this. The patch looks fine to me: > > Acked-by: Catalin Marinas <catalin.marinas@arm.com> Is this also a stable candidate? At least it applies cleanly to .35 -- Sent by an employee of the Qualcomm Innovation Center, Inc. The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum. ^ permalink raw reply [flat|nested] 24+ messages in thread
* [PATCH] ARM: V6 MPCore v6_dma_inv_range and v6_dma_flush_range RWFO fix 2010-12-09 22:07 ` Stephen Boyd @ 2010-12-10 5:44 ` George G. Davis 0 siblings, 0 replies; 24+ messages in thread From: George G. Davis @ 2010-12-10 5:44 UTC (permalink / raw) To: linux-arm-kernel On Thu, Dec 09, 2010 at 02:07:17PM -0800, Stephen Boyd wrote: > On 12/09/2010 08:04 AM, Catalin Marinas wrote: > > On 24 November 2010 18:39, Valentine Barshak <vbarshak@mvista.com> wrote: > >> Updated according to the comments to avoid r/w outside the buffer and > >> used byte r/w for the possible unaligned data. Seems to work fine. > >> > >> Cache ownership must be acqired by reading/writing data from the > >> cache line to make cache operation have the desired effect on the > >> SMP MPCore CPU. However, the ownership is never aquired in the > >> v6_dma_inv_range function when cleaning the first line and > >> flushing the last one, in case the address is not aligned > >> to D_CACHE_LINE_SIZE boundary. > >> Fix this by reading/writing data if needed, before performing > >> cache operations. > >> While at it, fix v6_dma_flush_range to prevent RWFO outside > >> the buffer. > >> > >> Signed-off-by: Valentine Barshak <vbarshak@mvista.com> > >> Signed-off-by: George G. Davis <gdavis@mvista.com> > > I eventually found a bit of time to look at this. The patch looks fine to me: > > > > Acked-by: Catalin Marinas <catalin.marinas@arm.com> > > Is this also a stable candidate? At least it applies cleanly to .35 I'd say it makes sense to add "Cc: stable at kernel.org" to the commit header before forwarding this to the ARM Linux kernel Patch Tracking System [1] in this case. -- Regards, George [1] http://www.arm.linux.org.uk/developer/patches/info.php > > -- > Sent by an employee of the Qualcomm Innovation Center, Inc. > The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum. ^ permalink raw reply [flat|nested] 24+ messages in thread
* [PATCH] ARM: V6 MPCore v6_dma_inv_range and v6_dma_flush_range RWFO fix 2010-11-24 18:39 ` [PATCH] ARM: V6 MPCore v6_dma_inv_range and v6_dma_flush_range " Valentine Barshak 2010-12-08 10:25 ` Valentine Barshak 2010-12-09 16:04 ` Catalin Marinas @ 2010-12-10 5:26 ` George G. Davis 2010-12-10 9:50 ` Catalin Marinas 2 siblings, 1 reply; 24+ messages in thread From: George G. Davis @ 2010-12-10 5:26 UTC (permalink / raw) To: linux-arm-kernel Hi, Apologies for the delayed reply. On Wed, Nov 24, 2010 at 09:39:36PM +0300, Valentine Barshak wrote: > Updated according to the comments to avoid r/w outside the buffer and > used byte r/w for the possible unaligned data. Seems to work fine. > > Cache ownership must be acqired by reading/writing data from the > cache line to make cache operation have the desired effect on the > SMP MPCore CPU. However, the ownership is never aquired in the > v6_dma_inv_range function when cleaning the first line and > flushing the last one, in case the address is not aligned > to D_CACHE_LINE_SIZE boundary. > Fix this by reading/writing data if needed, before performing > cache operations. > While at it, fix v6_dma_flush_range to prevent RWFO outside > the buffer. > > Signed-off-by: Valentine Barshak <vbarshak@mvista.com> > Signed-off-by: George G. Davis <gdavis@mvista.com> > --- > arch/arm/mm/cache-v6.S | 40 ++++++++++++++++++++++++++-------------- > 1 files changed, 26 insertions(+), 14 deletions(-) > > diff --git a/arch/arm/mm/cache-v6.S b/arch/arm/mm/cache-v6.S > index 99fa688..622f8a1 100644 > --- a/arch/arm/mm/cache-v6.S > +++ b/arch/arm/mm/cache-v6.S > @@ -203,25 +203,29 @@ ENTRY(v6_flush_kern_dcache_area) > * - end - virtual end address of region > */ > v6_dma_inv_range: > - tst r0, #D_CACHE_LINE_SIZE - 1 > - bic r0, r0, #D_CACHE_LINE_SIZE - 1 > -#ifdef HARVARD_CACHE > - mcrne p15, 0, r0, c7, c10, 1 @ clean D line > -#else > - mcrne p15, 0, r0, c7, c11, 1 @ clean unified line > -#endif > tst r1, #D_CACHE_LINE_SIZE - 1 > +#ifdef CONFIG_DMA_CACHE_RWFO > + ldrneb r2, [r1, #-1] @ read for ownership > + strneb r2, [r1, #-1] @ write for ownership > +#endif > bic r1, r1, #D_CACHE_LINE_SIZE - 1 > #ifdef HARVARD_CACHE > mcrne p15, 0, r1, c7, c14, 1 @ clean & invalidate D line > #else > mcrne p15, 0, r1, c7, c15, 1 @ clean & invalidate unified line > #endif > -1: > #ifdef CONFIG_DMA_CACHE_RWFO > - ldr r2, [r0] @ read for ownership > - str r2, [r0] @ write for ownership > + ldrb r2, [r0] @ read for ownership > + strb r2, [r0] @ write for ownership > +#endif > + tst r0, #D_CACHE_LINE_SIZE - 1 > + bic r0, r0, #D_CACHE_LINE_SIZE - 1 > +#ifdef HARVARD_CACHE > + mcrne p15, 0, r0, c7, c10, 1 @ clean D line > +#else > + mcrne p15, 0, r0, c7, c11, 1 @ clean unified line > #endif > +1: > #ifdef HARVARD_CACHE > mcr p15, 0, r0, c7, c6, 1 @ invalidate D line > #else > @@ -229,6 +233,10 @@ v6_dma_inv_range: > #endif > add r0, r0, #D_CACHE_LINE_SIZE > cmp r0, r1 > +#ifdef CONFIG_DMA_CACHE_RWFO > + ldrlo r2, [r0] @ read for ownership > + strlo r2, [r0] @ write for ownership > +#endif > blo 1b > mov r0, #0 > mcr p15, 0, r0, c7, c10, 4 @ drain write buffer Just a minor nit... Any reason for rearranging the order of 'start'/'end' parameters in the above changes? It's not clear that there is any performance benefit by rearranging those parameters and it obfuscates the changes, e.g. this is equivalent to your change above and also much clearer: diff --git a/arch/arm/mm/cache-v6.S b/arch/arm/mm/cache-v6.S index 99fa688..28e4799 100644 --- a/arch/arm/mm/cache-v6.S +++ b/arch/arm/mm/cache-v6.S @@ -203,6 +203,10 @@ ENTRY(v6_flush_kern_dcache_area) * - end - virtual end address of region */ v6_dma_inv_range: +#ifdef CONFIG_DMA_CACHE_RWFO + ldrb r2, [r0] @ read for ownership + strb r2, [r0] @ write for ownership +#endif tst r0, #D_CACHE_LINE_SIZE - 1 bic r0, r0, #D_CACHE_LINE_SIZE - 1 #ifdef HARVARD_CACHE @@ -211,6 +215,10 @@ v6_dma_inv_range: mcrne p15, 0, r0, c7, c11, 1 @ clean unified line #endif tst r1, #D_CACHE_LINE_SIZE - 1 +#ifdef CONFIG_DMA_CACHE_RWFO + ldrneb r2, [r1, #-1] @ read for ownership + strneb r2, [r1, #-1] @ write for ownership +#endif bic r1, r1, #D_CACHE_LINE_SIZE - 1 #ifdef HARVARD_CACHE mcrne p15, 0, r1, c7, c14, 1 @ clean & invalidate D line @@ -218,10 +226,6 @@ v6_dma_inv_range: mcrne p15, 0, r1, c7, c15, 1 @ clean & invalidate unified line #endif 1: -#ifdef CONFIG_DMA_CACHE_RWFO - ldr r2, [r0] @ read for ownership - str r2, [r0] @ write for ownership -#endif #ifdef HARVARD_CACHE mcr p15, 0, r0, c7, c6, 1 @ invalidate D line #else @@ -229,6 +233,10 @@ v6_dma_inv_range: #endif add r0, r0, #D_CACHE_LINE_SIZE cmp r0, r1 +#ifdef CONFIG_DMA_CACHE_RWFO + ldrlo r2, [r0] @ read for ownership + strlo r2, [r0] @ write for ownership +#endif blo 1b mov r0, #0 mcr p15, 0, r0, c7, c10, 4 @ drain write buffer -- Regards, George > @@ -263,12 +271,12 @@ v6_dma_clean_range: > * - end - virtual end address of region > */ > ENTRY(v6_dma_flush_range) > - bic r0, r0, #D_CACHE_LINE_SIZE - 1 > -1: > #ifdef CONFIG_DMA_CACHE_RWFO > - ldr r2, [r0] @ read for ownership > - str r2, [r0] @ write for ownership > + ldrb r2, [r0] @ read for ownership > + strb r2, [r0] @ write for ownership > #endif > + bic r0, r0, #D_CACHE_LINE_SIZE - 1 > +1: > #ifdef HARVARD_CACHE > mcr p15, 0, r0, c7, c14, 1 @ clean & invalidate D line > #else > @@ -276,6 +284,10 @@ ENTRY(v6_dma_flush_range) > #endif > add r0, r0, #D_CACHE_LINE_SIZE > cmp r0, r1 > +#ifdef CONFIG_DMA_CACHE_RWFO > + ldrlob r2, [r0] @ read for ownership > + strlob r2, [r0] @ write for ownership > +#endif > blo 1b > mov r0, #0 > mcr p15, 0, r0, c7, c10, 4 @ drain write buffer > -- > 1.6.0.6 ^ permalink raw reply related [flat|nested] 24+ messages in thread
* [PATCH] ARM: V6 MPCore v6_dma_inv_range and v6_dma_flush_range RWFO fix 2010-12-10 5:26 ` George G. Davis @ 2010-12-10 9:50 ` Catalin Marinas 2010-12-10 12:29 ` Valentine Barshak 0 siblings, 1 reply; 24+ messages in thread From: Catalin Marinas @ 2010-12-10 9:50 UTC (permalink / raw) To: linux-arm-kernel On Fri, 2010-12-10 at 05:26 +0000, George G. Davis wrote: > On Wed, Nov 24, 2010 at 09:39:36PM +0300, Valentine Barshak wrote: > > Updated according to the comments to avoid r/w outside the buffer and > > used byte r/w for the possible unaligned data. Seems to work fine. > > > > Cache ownership must be acqired by reading/writing data from the > > cache line to make cache operation have the desired effect on the > > SMP MPCore CPU. However, the ownership is never aquired in the > > v6_dma_inv_range function when cleaning the first line and > > flushing the last one, in case the address is not aligned > > to D_CACHE_LINE_SIZE boundary. > > Fix this by reading/writing data if needed, before performing > > cache operations. > > While at it, fix v6_dma_flush_range to prevent RWFO outside > > the buffer. > > > > Signed-off-by: Valentine Barshak <vbarshak@mvista.com> > > Signed-off-by: George G. Davis <gdavis@mvista.com> > > --- > > arch/arm/mm/cache-v6.S | 40 ++++++++++++++++++++++++++-------------- > > 1 files changed, 26 insertions(+), 14 deletions(-) > > > > diff --git a/arch/arm/mm/cache-v6.S b/arch/arm/mm/cache-v6.S > > index 99fa688..622f8a1 100644 > > --- a/arch/arm/mm/cache-v6.S > > +++ b/arch/arm/mm/cache-v6.S > > @@ -203,25 +203,29 @@ ENTRY(v6_flush_kern_dcache_area) > > * - end - virtual end address of region > > */ > > v6_dma_inv_range: > > - tst r0, #D_CACHE_LINE_SIZE - 1 > > - bic r0, r0, #D_CACHE_LINE_SIZE - 1 > > -#ifdef HARVARD_CACHE > > - mcrne p15, 0, r0, c7, c10, 1 @ clean D line > > -#else > > - mcrne p15, 0, r0, c7, c11, 1 @ clean unified line > > -#endif > > tst r1, #D_CACHE_LINE_SIZE - 1 > > +#ifdef CONFIG_DMA_CACHE_RWFO > > + ldrneb r2, [r1, #-1] @ read for ownership > > + strneb r2, [r1, #-1] @ write for ownership > > +#endif > > bic r1, r1, #D_CACHE_LINE_SIZE - 1 > > #ifdef HARVARD_CACHE > > mcrne p15, 0, r1, c7, c14, 1 @ clean & invalidate D line > > #else > > mcrne p15, 0, r1, c7, c15, 1 @ clean & invalidate unified line > > #endif > > -1: > > #ifdef CONFIG_DMA_CACHE_RWFO > > - ldr r2, [r0] @ read for ownership > > - str r2, [r0] @ write for ownership > > + ldrb r2, [r0] @ read for ownership > > + strb r2, [r0] @ write for ownership > > +#endif > > + tst r0, #D_CACHE_LINE_SIZE - 1 > > + bic r0, r0, #D_CACHE_LINE_SIZE - 1 > > +#ifdef HARVARD_CACHE > > + mcrne p15, 0, r0, c7, c10, 1 @ clean D line > > +#else > > + mcrne p15, 0, r0, c7, c11, 1 @ clean unified line > > #endif > > +1: > > #ifdef HARVARD_CACHE > > mcr p15, 0, r0, c7, c6, 1 @ invalidate D line > > #else > > @@ -229,6 +233,10 @@ v6_dma_inv_range: > > #endif > > add r0, r0, #D_CACHE_LINE_SIZE > > cmp r0, r1 > > +#ifdef CONFIG_DMA_CACHE_RWFO > > + ldrlo r2, [r0] @ read for ownership > > + strlo r2, [r0] @ write for ownership > > +#endif > > blo 1b > > mov r0, #0 > > mcr p15, 0, r0, c7, c10, 4 @ drain write buffer > > Just a minor nit... > > Any reason for rearranging the order of 'start'/'end' parameters in the > above changes? It's not clear that there is any performance benefit by > rearranging those parameters and it obfuscates the changes, e.g. this > is equivalent to your change above and also much clearer: I've been wondering about this as well. But I was too lazy to rewrite it and see whether there is a performance benefit in the original patch. I agree that yours looks cleaner. -- Catalin ^ permalink raw reply [flat|nested] 24+ messages in thread
* [PATCH] ARM: V6 MPCore v6_dma_inv_range and v6_dma_flush_range RWFO fix 2010-12-10 9:50 ` Catalin Marinas @ 2010-12-10 12:29 ` Valentine Barshak 2010-12-10 16:49 ` George G. Davis 0 siblings, 1 reply; 24+ messages in thread From: Valentine Barshak @ 2010-12-10 12:29 UTC (permalink / raw) To: linux-arm-kernel Catalin Marinas wrote: > On Fri, 2010-12-10 at 05:26 +0000, George G. Davis wrote: > >> On Wed, Nov 24, 2010 at 09:39:36PM +0300, Valentine Barshak wrote: >> >>> Updated according to the comments to avoid r/w outside the buffer and >>> used byte r/w for the possible unaligned data. Seems to work fine. >>> >>> Cache ownership must be acqired by reading/writing data from the >>> cache line to make cache operation have the desired effect on the >>> SMP MPCore CPU. However, the ownership is never aquired in the >>> v6_dma_inv_range function when cleaning the first line and >>> flushing the last one, in case the address is not aligned >>> to D_CACHE_LINE_SIZE boundary. >>> Fix this by reading/writing data if needed, before performing >>> cache operations. >>> While at it, fix v6_dma_flush_range to prevent RWFO outside >>> the buffer. >>> >>> Signed-off-by: Valentine Barshak <vbarshak@mvista.com> >>> Signed-off-by: George G. Davis <gdavis@mvista.com> >>> --- >>> arch/arm/mm/cache-v6.S | 40 ++++++++++++++++++++++++++-------------- >>> 1 files changed, 26 insertions(+), 14 deletions(-) >>> >>> diff --git a/arch/arm/mm/cache-v6.S b/arch/arm/mm/cache-v6.S >>> index 99fa688..622f8a1 100644 >>> --- a/arch/arm/mm/cache-v6.S >>> +++ b/arch/arm/mm/cache-v6.S >>> @@ -203,25 +203,29 @@ ENTRY(v6_flush_kern_dcache_area) >>> * - end - virtual end address of region >>> */ >>> v6_dma_inv_range: >>> - tst r0, #D_CACHE_LINE_SIZE - 1 >>> - bic r0, r0, #D_CACHE_LINE_SIZE - 1 >>> -#ifdef HARVARD_CACHE >>> - mcrne p15, 0, r0, c7, c10, 1 @ clean D line >>> -#else >>> - mcrne p15, 0, r0, c7, c11, 1 @ clean unified line >>> -#endif >>> tst r1, #D_CACHE_LINE_SIZE - 1 >>> +#ifdef CONFIG_DMA_CACHE_RWFO >>> + ldrneb r2, [r1, #-1] @ read for ownership >>> + strneb r2, [r1, #-1] @ write for ownership >>> +#endif >>> bic r1, r1, #D_CACHE_LINE_SIZE - 1 >>> #ifdef HARVARD_CACHE >>> mcrne p15, 0, r1, c7, c14, 1 @ clean & invalidate D line >>> #else >>> mcrne p15, 0, r1, c7, c15, 1 @ clean & invalidate unified line >>> #endif >>> -1: >>> #ifdef CONFIG_DMA_CACHE_RWFO >>> - ldr r2, [r0] @ read for ownership >>> - str r2, [r0] @ write for ownership >>> + ldrb r2, [r0] @ read for ownership >>> + strb r2, [r0] @ write for ownership >>> +#endif >>> + tst r0, #D_CACHE_LINE_SIZE - 1 >>> + bic r0, r0, #D_CACHE_LINE_SIZE - 1 >>> +#ifdef HARVARD_CACHE >>> + mcrne p15, 0, r0, c7, c10, 1 @ clean D line >>> +#else >>> + mcrne p15, 0, r0, c7, c11, 1 @ clean unified line >>> #endif >>> +1: >>> #ifdef HARVARD_CACHE >>> mcr p15, 0, r0, c7, c6, 1 @ invalidate D line >>> #else >>> @@ -229,6 +233,10 @@ v6_dma_inv_range: >>> #endif >>> add r0, r0, #D_CACHE_LINE_SIZE >>> cmp r0, r1 >>> +#ifdef CONFIG_DMA_CACHE_RWFO >>> + ldrlo r2, [r0] @ read for ownership >>> + strlo r2, [r0] @ write for ownership >>> +#endif >>> blo 1b >>> mov r0, #0 >>> mcr p15, 0, r0, c7, c10, 4 @ drain write buffer >>> >> Just a minor nit... >> >> Any reason for rearranging the order of 'start'/'end' parameters in the >> above changes? It's not clear that there is any performance benefit by >> rearranging those parameters and it obfuscates the changes, e.g. this >> is equivalent to your change above and also much clearer: >> > > It probably obfuscates the change, but I thought it made the code a bit cleaner and easier to read. So that first the end (r1) is flushed and then we do all the operations needed on the first cache line (r0) in a row. I'll resubmit the change and add "Cc: stable at kernel.org" to the header. Thanks, Val. > I've been wondering about this as well. But I was too lazy to rewrite it > and see whether there is a performance benefit in the original patch. I > agree that yours looks cleaner. > > ^ permalink raw reply [flat|nested] 24+ messages in thread
* [PATCH] ARM: V6 MPCore v6_dma_inv_range and v6_dma_flush_range RWFO fix 2010-12-10 12:29 ` Valentine Barshak @ 2010-12-10 16:49 ` George G. Davis 0 siblings, 0 replies; 24+ messages in thread From: George G. Davis @ 2010-12-10 16:49 UTC (permalink / raw) To: linux-arm-kernel On Fri, Dec 10, 2010 at 03:29:09PM +0300, Valentine Barshak wrote: > Catalin Marinas wrote: > >On Fri, 2010-12-10 at 05:26 +0000, George G. Davis wrote: > >>On Wed, Nov 24, 2010 at 09:39:36PM +0300, Valentine Barshak wrote: > >>>Updated according to the comments to avoid r/w outside the buffer and > >>>used byte r/w for the possible unaligned data. Seems to work fine. > >>> > >>>Cache ownership must be acqired by reading/writing data from the > >>>cache line to make cache operation have the desired effect on the > >>>SMP MPCore CPU. However, the ownership is never aquired in the > >>>v6_dma_inv_range function when cleaning the first line and > >>>flushing the last one, in case the address is not aligned > >>>to D_CACHE_LINE_SIZE boundary. > >>>Fix this by reading/writing data if needed, before performing > >>>cache operations. > >>>While at it, fix v6_dma_flush_range to prevent RWFO outside > >>>the buffer. > >>> > >>>Signed-off-by: Valentine Barshak <vbarshak@mvista.com> > >>>Signed-off-by: George G. Davis <gdavis@mvista.com> > >>>--- > >>> arch/arm/mm/cache-v6.S | 40 ++++++++++++++++++++++++++-------------- > >>> 1 files changed, 26 insertions(+), 14 deletions(-) > >>> > >>>diff --git a/arch/arm/mm/cache-v6.S b/arch/arm/mm/cache-v6.S > >>>index 99fa688..622f8a1 100644 > >>>--- a/arch/arm/mm/cache-v6.S > >>>+++ b/arch/arm/mm/cache-v6.S > >>>@@ -203,25 +203,29 @@ ENTRY(v6_flush_kern_dcache_area) > >>> * - end - virtual end address of region > >>> */ > >>> v6_dma_inv_range: > >>>- tst r0, #D_CACHE_LINE_SIZE - 1 > >>>- bic r0, r0, #D_CACHE_LINE_SIZE - 1 > >>>-#ifdef HARVARD_CACHE > >>>- mcrne p15, 0, r0, c7, c10, 1 @ clean D line > >>>-#else > >>>- mcrne p15, 0, r0, c7, c11, 1 @ clean unified line > >>>-#endif > >>> tst r1, #D_CACHE_LINE_SIZE - 1 > >>>+#ifdef CONFIG_DMA_CACHE_RWFO > >>>+ ldrneb r2, [r1, #-1] @ read for ownership > >>>+ strneb r2, [r1, #-1] @ write for ownership > >>>+#endif > >>> bic r1, r1, #D_CACHE_LINE_SIZE - 1 > >>> #ifdef HARVARD_CACHE > >>> mcrne p15, 0, r1, c7, c14, 1 @ clean & invalidate D line > >>> #else > >>> mcrne p15, 0, r1, c7, c15, 1 @ clean & invalidate unified line > >>> #endif > >>>-1: > >>> #ifdef CONFIG_DMA_CACHE_RWFO > >>>- ldr r2, [r0] @ read for ownership > >>>- str r2, [r0] @ write for ownership > >>>+ ldrb r2, [r0] @ read for ownership > >>>+ strb r2, [r0] @ write for ownership > >>>+#endif > >>>+ tst r0, #D_CACHE_LINE_SIZE - 1 > >>>+ bic r0, r0, #D_CACHE_LINE_SIZE - 1 > >>>+#ifdef HARVARD_CACHE > >>>+ mcrne p15, 0, r0, c7, c10, 1 @ clean D line > >>>+#else > >>>+ mcrne p15, 0, r0, c7, c11, 1 @ clean unified line > >>> #endif > >>>+1: > >>> #ifdef HARVARD_CACHE > >>> mcr p15, 0, r0, c7, c6, 1 @ invalidate D line > >>> #else > >>>@@ -229,6 +233,10 @@ v6_dma_inv_range: > >>> #endif > >>> add r0, r0, #D_CACHE_LINE_SIZE > >>> cmp r0, r1 > >>>+#ifdef CONFIG_DMA_CACHE_RWFO > >>>+ ldrlo r2, [r0] @ read for ownership > >>>+ strlo r2, [r0] @ write for ownership > >>>+#endif > >>> blo 1b > >>> mov r0, #0 > >>> mcr p15, 0, r0, c7, c10, 4 @ drain write buffer > >>Just a minor nit... > >> > >>Any reason for rearranging the order of 'start'/'end' parameters in the > >>above changes? It's not clear that there is any performance benefit by > >>rearranging those parameters and it obfuscates the changes, e.g. this > >>is equivalent to your change above and also much clearer: > > > It probably obfuscates the change, but I thought it made the code a > bit cleaner and easier to read. > So that first the end (r1) is flushed and then we do all the > operations needed on the first cache line (r0) in a row. OK, I see your rationale now for moving start (r0) alignment and RWFO fixups just prior to the "clean & invalidate" loop to make the code clearer to the casual reader that RWFO for the first line is done outside the loop and conditionally for remaining lines at the end of the loop. That's admittedly subtle but I'd still prefer to see small incremental changes for faster review (or more fully describe those subtle changes in the commit header). Alas, without comments in the code or commit header, the subtleness of those RWFOs won't be any clearer regardless of their spatial locality. You can always submit another patch to rearrange the code for better flow/readability later. > I'll resubmit the change and add "Cc: stable at kernel.org" to the header. Thanks! -- Regards, George > > Thanks, > Val. > >I've been wondering about this as well. But I was too lazy to rewrite it > >and see whether there is a performance benefit in the original patch. I > >agree that yours looks cleaner. > > ^ permalink raw reply [flat|nested] 24+ messages in thread
* [PATCH] ARM: V6 MPCore v6_dma_inv_range and v6_dma_flush_range RWFO fix 2010-11-24 18:07 ` Catalin Marinas 2010-11-24 18:39 ` [PATCH] ARM: V6 MPCore v6_dma_inv_range and v6_dma_flush_range " Valentine Barshak @ 2010-11-29 15:28 ` Valentine Barshak 1 sibling, 0 replies; 24+ messages in thread From: Valentine Barshak @ 2010-11-29 15:28 UTC (permalink / raw) To: linux-arm-kernel Resending this one. Somehow it got lost and never appeared on the list. Updated according to the comments to avoid r/w outside the buffer and used byte r/w for the possible unaligned data. Seems to work fine. Cache ownership must be acqired by reading/writing data from the cache line to make cache operation have the desired effect on the SMP MPCore CPU. However, the ownership is never aquired in the v6_dma_inv_range function when cleaning the first line and flushing the last one, in case the address is not aligned to D_CACHE_LINE_SIZE boundary. Fix this by reading/writing data if needed, before performing cache operations. While at it, fix v6_dma_flush_range to prevent RWFO outside the buffer. Signed-off-by: Valentine Barshak <vbarshak@mvista.com> Signed-off-by: George G. Davis <gdavis@mvista.com> diff --git a/arch/arm/mm/cache-v6.S b/arch/arm/mm/cache-v6.S index 99fa688..622f8a1 100644 --- a/arch/arm/mm/cache-v6.S +++ b/arch/arm/mm/cache-v6.S @@ -203,25 +203,29 @@ ENTRY(v6_flush_kern_dcache_area) * - end - virtual end address of region */ v6_dma_inv_range: - tst r0, #D_CACHE_LINE_SIZE - 1 - bic r0, r0, #D_CACHE_LINE_SIZE - 1 -#ifdef HARVARD_CACHE - mcrne p15, 0, r0, c7, c10, 1 @ clean D line -#else - mcrne p15, 0, r0, c7, c11, 1 @ clean unified line -#endif tst r1, #D_CACHE_LINE_SIZE - 1 +#ifdef CONFIG_DMA_CACHE_RWFO + ldrneb r2, [r1, #-1] @ read for ownership + strneb r2, [r1, #-1] @ write for ownership +#endif bic r1, r1, #D_CACHE_LINE_SIZE - 1 #ifdef HARVARD_CACHE mcrne p15, 0, r1, c7, c14, 1 @ clean & invalidate D line #else mcrne p15, 0, r1, c7, c15, 1 @ clean & invalidate unified line #endif -1: #ifdef CONFIG_DMA_CACHE_RWFO - ldr r2, [r0] @ read for ownership - str r2, [r0] @ write for ownership + ldrb r2, [r0] @ read for ownership + strb r2, [r0] @ write for ownership +#endif + tst r0, #D_CACHE_LINE_SIZE - 1 + bic r0, r0, #D_CACHE_LINE_SIZE - 1 +#ifdef HARVARD_CACHE + mcrne p15, 0, r0, c7, c10, 1 @ clean D line +#else + mcrne p15, 0, r0, c7, c11, 1 @ clean unified line #endif +1: #ifdef HARVARD_CACHE mcr p15, 0, r0, c7, c6, 1 @ invalidate D line #else @@ -229,6 +233,10 @@ v6_dma_inv_range: #endif add r0, r0, #D_CACHE_LINE_SIZE cmp r0, r1 +#ifdef CONFIG_DMA_CACHE_RWFO + ldrlo r2, [r0] @ read for ownership + strlo r2, [r0] @ write for ownership +#endif blo 1b mov r0, #0 mcr p15, 0, r0, c7, c10, 4 @ drain write buffer @@ -263,12 +271,12 @@ v6_dma_clean_range: * - end - virtual end address of region */ ENTRY(v6_dma_flush_range) - bic r0, r0, #D_CACHE_LINE_SIZE - 1 -1: #ifdef CONFIG_DMA_CACHE_RWFO - ldr r2, [r0] @ read for ownership - str r2, [r0] @ write for ownership + ldrb r2, [r0] @ read for ownership + strb r2, [r0] @ write for ownership #endif + bic r0, r0, #D_CACHE_LINE_SIZE - 1 +1: #ifdef HARVARD_CACHE mcr p15, 0, r0, c7, c14, 1 @ clean & invalidate D line #else @@ -276,6 +284,10 @@ ENTRY(v6_dma_flush_range) #endif add r0, r0, #D_CACHE_LINE_SIZE cmp r0, r1 +#ifdef CONFIG_DMA_CACHE_RWFO + ldrlob r2, [r0] @ read for ownership + strlob r2, [r0] @ write for ownership +#endif blo 1b mov r0, #0 mcr p15, 0, r0, c7, c10, 4 @ drain write buffer ^ permalink raw reply related [flat|nested] 24+ messages in thread
* [PATCH] ARM: V6 MPCore v6_dma_inv_range RWFO fix 2010-11-23 22:42 ` Russell King - ARM Linux 2010-11-24 0:24 ` George G. Davis 2010-11-24 10:42 ` Catalin Marinas @ 2010-11-24 17:40 ` Valentine Barshak 2 siblings, 0 replies; 24+ messages in thread From: Valentine Barshak @ 2010-11-24 17:40 UTC (permalink / raw) To: linux-arm-kernel Russell King - ARM Linux wrote: > On Wed, Nov 24, 2010 at 01:28:06AM +0300, Valentine Barshak wrote: > >> Cache ownership must be acqired by reading/writing data from the >> cache line to make cache operation have the desired effect on the >> SMP MPCore CPU. However, the ownership is never aquired in the >> v6_dma_inv_range function when cleaning the first line and >> flushing the last one, in case the address is not aligned >> to D_CACHE_LINE_SIZE boundary. >> Fix this by reading/writing data if needed, before performing >> cache operations. >> > > You should do this on the data _inside_ the requested buffer. We don't > know if the overlapping cache line shares itself with some atomic > variable, and doing a read-write on it could undo other updates to it. > The same problem could hit us in v6_dma_flush_range as well, when we do r/w after bic. Thanks, Val. ^ permalink raw reply [flat|nested] 24+ messages in thread
end of thread, other threads:[~2010-12-10 16:49 UTC | newest] Thread overview: 24+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2010-11-23 22:28 [PATCH] ARM: V6 MPCore v6_dma_inv_range RWFO fix Valentine Barshak 2010-11-23 22:42 ` Russell King - ARM Linux 2010-11-24 0:24 ` George G. Davis 2010-11-24 10:32 ` Catalin Marinas 2010-11-24 15:14 ` George G. Davis 2010-11-24 10:42 ` Catalin Marinas 2010-11-24 15:10 ` George G. Davis 2010-11-24 15:35 ` Catalin Marinas 2010-11-24 17:33 ` Russell King - ARM Linux 2010-11-24 17:46 ` Catalin Marinas 2010-11-24 18:01 ` Russell King - ARM Linux 2010-11-24 18:07 ` Catalin Marinas 2010-11-24 18:39 ` [PATCH] ARM: V6 MPCore v6_dma_inv_range and v6_dma_flush_range " Valentine Barshak 2010-12-08 10:25 ` Valentine Barshak 2010-12-09 16:04 ` Catalin Marinas 2010-12-09 17:04 ` Valentine Barshak 2010-12-09 22:07 ` Stephen Boyd 2010-12-10 5:44 ` George G. Davis 2010-12-10 5:26 ` George G. Davis 2010-12-10 9:50 ` Catalin Marinas 2010-12-10 12:29 ` Valentine Barshak 2010-12-10 16:49 ` George G. Davis 2010-11-29 15:28 ` Valentine Barshak 2010-11-24 17:40 ` [PATCH] ARM: V6 MPCore v6_dma_inv_range " Valentine Barshak
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).