public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH][RFC] asm-generic:remove calling flush_write_buffers() in dma_sync_*_for_cpu
@ 2009-06-28 14:39 tom.leiming
  2009-06-28 15:34 ` Arnd Bergmann
  0 siblings, 1 reply; 28+ messages in thread
From: tom.leiming @ 2009-06-28 14:39 UTC (permalink / raw)
  To: joerg.roedel, fujita.tomonori; +Cc: linux-kernel, akpm, Ming Lei

From: Ming Lei <tom.leiming@gmail.com>

dma_sync_*_for_cpu() is introduced to make cpu access dma buffers safely when
dma transfer is over, it seems there is nothing to do with cpu write buffer,
so remove it.

Signed-off-by: Ming Lei <tom.leiming@gmail.com>
---
 include/asm-generic/dma-mapping-common.h |    3 ---
 1 files changed, 0 insertions(+), 3 deletions(-)

diff --git a/include/asm-generic/dma-mapping-common.h b/include/asm-generic/dma-mapping-common.h
index 5406a60..73411ab 100644
--- a/include/asm-generic/dma-mapping-common.h
+++ b/include/asm-generic/dma-mapping-common.h
@@ -103,7 +103,6 @@ static inline void dma_sync_single_for_cpu(struct device *dev, dma_addr_t addr,
 	if (ops->sync_single_for_cpu)
 		ops->sync_single_for_cpu(dev, addr, size, dir);
 	debug_dma_sync_single_for_cpu(dev, addr, size, dir);
-	flush_write_buffers();
 }
 
 static inline void dma_sync_single_for_device(struct device *dev,
@@ -132,7 +131,6 @@ static inline void dma_sync_single_range_for_cpu(struct device *dev,
 		ops->sync_single_range_for_cpu(dev, addr, offset, size, dir);
 		debug_dma_sync_single_range_for_cpu(dev, addr, offset, size, dir);
 
-		flush_write_buffers();
 	} else
 		dma_sync_single_for_cpu(dev, addr, size, dir);
 }
@@ -165,7 +163,6 @@ dma_sync_sg_for_cpu(struct device *dev, struct scatterlist *sg,
 	if (ops->sync_sg_for_cpu)
 		ops->sync_sg_for_cpu(dev, sg, nelems, dir);
 	debug_dma_sync_sg_for_cpu(dev, sg, nelems, dir);
-	flush_write_buffers();
 }
 
 static inline void
-- 
1.6.0.GIT


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* Re: [PATCH][RFC] asm-generic:remove calling flush_write_buffers() in dma_sync_*_for_cpu
  2009-06-28 14:39 [PATCH][RFC] asm-generic:remove calling flush_write_buffers() in dma_sync_*_for_cpu tom.leiming
@ 2009-06-28 15:34 ` Arnd Bergmann
  2009-06-29 12:31   ` Joerg Roedel
  0 siblings, 1 reply; 28+ messages in thread
From: Arnd Bergmann @ 2009-06-28 15:34 UTC (permalink / raw)
  To: tom.leiming; +Cc: joerg.roedel, fujita.tomonori, linux-kernel, akpm

On Sunday 28 June 2009 14:39:19 tom.leiming@gmail.com wrote:
> From: Ming Lei <tom.leiming@gmail.com>
> 
> dma_sync_*_for_cpu() is introduced to make cpu access dma buffers safely when
> dma transfer is over, it seems there is nothing to do with cpu write buffer,
> so remove it.
> 
> Signed-off-by: Ming Lei <tom.leiming@gmail.com>

Right, this looks correct. On a related note, flush_write_buffers is
architecture specific right now: only x86 and frv implement it at all,
though and with slightly different semantics.

Maybe it would be more consistent to change the dma_map_* and
dma_sync_*_for_device stuff there to wmb() to make it  portable
to other architectures.

	Arnd <><

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH][RFC] asm-generic:remove calling flush_write_buffers() in dma_sync_*_for_cpu
  2009-06-28 15:34 ` Arnd Bergmann
@ 2009-06-29 12:31   ` Joerg Roedel
  2009-06-29 13:51     ` Ming Lei
  2009-06-29 16:22     ` Arnd Bergmann
  0 siblings, 2 replies; 28+ messages in thread
From: Joerg Roedel @ 2009-06-29 12:31 UTC (permalink / raw)
  To: Arnd Bergmann; +Cc: tom.leiming, fujita.tomonori, linux-kernel, akpm

On Sun, Jun 28, 2009 at 03:34:35PM +0000, Arnd Bergmann wrote:
> On Sunday 28 June 2009 14:39:19 tom.leiming@gmail.com wrote:
> > From: Ming Lei <tom.leiming@gmail.com>
> > 
> > dma_sync_*_for_cpu() is introduced to make cpu access dma buffers safely when
> > dma transfer is over, it seems there is nothing to do with cpu write buffer,
> > so remove it.
> > 
> > Signed-off-by: Ming Lei <tom.leiming@gmail.com>
> 
> Right, this looks correct. On a related note, flush_write_buffers is
> architecture specific right now: only x86 and frv implement it at all,
> though and with slightly different semantics.

This doen't look correct to me. The sync functions may do bounce buffering
which is all about copying data from one place in main memory to another. So we
need these flush_write_buffer() calls in the _for_cpu path too.

> Maybe it would be more consistent to change the dma_map_* and
> dma_sync_*_for_device stuff there to wmb() to make it  portable
> to other architectures.

If we change it to wmb() it would be executed every time there even if the
processor doesn't require it. Other architectures could simply add a
flush_write_buffers() implemention if they want to adapt the common dma-mapping
implementation, no?

	Joerg

-- 
           | Advanced Micro Devices GmbH
 Operating | Karl-Hammerschmidt-Str. 34, 85609 Dornach bei München
 System    | 
 Research  | Geschäftsführer: Thomas M. McCoy, Giuliano Meroni
 Center    | Sitz: Dornach, Gemeinde Aschheim, Landkreis München
           | Registergericht München, HRB Nr. 43632


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH][RFC] asm-generic:remove calling flush_write_buffers() in  dma_sync_*_for_cpu
  2009-06-29 12:31   ` Joerg Roedel
@ 2009-06-29 13:51     ` Ming Lei
  2009-06-29 14:45       ` Joerg Roedel
  2009-06-29 16:22     ` Arnd Bergmann
  1 sibling, 1 reply; 28+ messages in thread
From: Ming Lei @ 2009-06-29 13:51 UTC (permalink / raw)
  To: Joerg Roedel; +Cc: Arnd Bergmann, fujita.tomonori, linux-kernel, akpm

2009/6/29 Joerg Roedel <joerg.roedel@amd.com>:
> On Sun, Jun 28, 2009 at 03:34:35PM +0000, Arnd Bergmann wrote:
>> On Sunday 28 June 2009 14:39:19 tom.leiming@gmail.com wrote:
>> > From: Ming Lei <tom.leiming@gmail.com>
>> >
>> > dma_sync_*_for_cpu() is introduced to make cpu access dma buffers safely when
>> > dma transfer is over, it seems there is nothing to do with cpu write buffer,
>> > so remove it.
>> >
>> > Signed-off-by: Ming Lei <tom.leiming@gmail.com>
>>
>> Right, this looks correct. On a related note, flush_write_buffers is
>> architecture specific right now: only x86 and frv implement it at all,
>> though and with slightly different semantics.
>
> This doen't look correct to me. The sync functions may do bounce buffering
> which is all about copying data from one place in main memory to another. So we
> need these flush_write_buffer() calls in the _for_cpu path too.

IMHO, even we do not call flush_write_buffer(), CPU can read correct
data from the
dma buffer since write buffer can't affect cache, right?

Thanks.

>
>> Maybe it would be more consistent to change the dma_map_* and
>> dma_sync_*_for_device stuff there to wmb() to make it  portable
>> to other architectures.
>
> If we change it to wmb() it would be executed every time there even if the
> processor doesn't require it. Other architectures could simply add a
> flush_write_buffers() implemention if they want to adapt the common dma-mapping
> implementation, no?
>
>        Joerg
>
> --
>           | Advanced Micro Devices GmbH
>  Operating | Karl-Hammerschmidt-Str. 34, 85609 Dornach bei München
>  System    |
>  Research  | Geschäftsführer: Thomas M. McCoy, Giuliano Meroni
>  Center    | Sitz: Dornach, Gemeinde Aschheim, Landkreis München
>           | Registergericht München, HRB Nr. 43632
>
>



-- 
Lei Ming

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH][RFC] asm-generic:remove calling flush_write_buffers() in dma_sync_*_for_cpu
  2009-06-29 13:51     ` Ming Lei
@ 2009-06-29 14:45       ` Joerg Roedel
  2009-06-29 14:54         ` Ming Lei
  0 siblings, 1 reply; 28+ messages in thread
From: Joerg Roedel @ 2009-06-29 14:45 UTC (permalink / raw)
  To: Ming Lei; +Cc: Arnd Bergmann, fujita.tomonori, linux-kernel, akpm

On Mon, Jun 29, 2009 at 09:51:34PM +0800, Ming Lei wrote:
> 2009/6/29 Joerg Roedel <joerg.roedel@amd.com>:
> > On Sun, Jun 28, 2009 at 03:34:35PM +0000, Arnd Bergmann wrote:
> >> On Sunday 28 June 2009 14:39:19 tom.leiming@gmail.com wrote:
> >> > From: Ming Lei <tom.leiming@gmail.com>
> >> >
> >> > dma_sync_*_for_cpu() is introduced to make cpu access dma buffers safely when
> >> > dma transfer is over, it seems there is nothing to do with cpu write buffer,
> >> > so remove it.
> >> >
> >> > Signed-off-by: Ming Lei <tom.leiming@gmail.com>
> >>
> >> Right, this looks correct. On a related note, flush_write_buffers is
> >> architecture specific right now: only x86 and frv implement it at all,
> >> though and with slightly different semantics.
> >
> > This doen't look correct to me. The sync functions may do bounce buffering
> > which is all about copying data from one place in main memory to another. So we
> > need these flush_write_buffer() calls in the _for_cpu path too.
> 
> IMHO, even we do not call flush_write_buffer(), CPU can read correct
> data from the
> dma buffer since write buffer can't affect cache, right?

flush_write_buffer is not about cache flushing. It is about read/write
reordering in the CPU. Think of it as a memory barrier. On most x86
systems this function is therefore a nop. Cache flushing for
architectures without cache-coherent DMA is typically handled in their
low-level dma-api code (I have seen that at least in parisc32).

	Joerg

-- 
           | Advanced Micro Devices GmbH
 Operating | Karl-Hammerschmidt-Str. 34, 85609 Dornach bei München
 System    | 
 Research  | Geschäftsführer: Thomas M. McCoy, Giuliano Meroni
 Center    | Sitz: Dornach, Gemeinde Aschheim, Landkreis München
           | Registergericht München, HRB Nr. 43632


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH][RFC] asm-generic:remove calling flush_write_buffers() in  dma_sync_*_for_cpu
  2009-06-29 14:45       ` Joerg Roedel
@ 2009-06-29 14:54         ` Ming Lei
  2009-06-29 15:44           ` Joerg Roedel
  0 siblings, 1 reply; 28+ messages in thread
From: Ming Lei @ 2009-06-29 14:54 UTC (permalink / raw)
  To: Joerg Roedel; +Cc: Arnd Bergmann, fujita.tomonori, linux-kernel, akpm

2009/6/29 Joerg Roedel <joerg.roedel@amd.com>:
>>
>> IMHO, even we do not call flush_write_buffer(), CPU can read correct
>> data from the
>> dma buffer since write buffer can't affect cache, right?
>
> flush_write_buffer is not about cache flushing. It is about read/write
> reordering in the CPU. Think of it as a memory barrier. On most x86

You mean we may need a memory barrier between writing data from bouncing buffer
to dma buffer and reading data from dma buffer, do you?

> systems this function is therefore a nop. Cache flushing for
> architectures without cache-coherent DMA is typically handled in their
> low-level dma-api code (I have seen that at least in parisc32).



-- 
Lei Ming

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH][RFC] asm-generic:remove calling flush_write_buffers() in dma_sync_*_for_cpu
  2009-06-29 14:54         ` Ming Lei
@ 2009-06-29 15:44           ` Joerg Roedel
  0 siblings, 0 replies; 28+ messages in thread
From: Joerg Roedel @ 2009-06-29 15:44 UTC (permalink / raw)
  To: Ming Lei; +Cc: Arnd Bergmann, fujita.tomonori, linux-kernel, akpm

On Mon, Jun 29, 2009 at 10:54:51PM +0800, Ming Lei wrote:
> 2009/6/29 Joerg Roedel <joerg.roedel@amd.com>:
> >>
> >> IMHO, even we do not call flush_write_buffer(), CPU can read correct
> >> data from the
> >> dma buffer since write buffer can't affect cache, right?
> >
> > flush_write_buffer is not about cache flushing. It is about read/write
> > reordering in the CPU. Think of it as a memory barrier. On most x86
> 
> You mean we may need a memory barrier between writing data from bouncing buffer
> to dma buffer and reading data from dma buffer, do you?

CONFIG_X86_OOSTORE (when set flush_write_buffers is a memory barrier and not a
nop on x86) is defined for WINCHIP. So it seems necessary on some chips.

	Joerg

-- 
           | Advanced Micro Devices GmbH
 Operating | Karl-Hammerschmidt-Str. 34, 85609 Dornach bei München
 System    | 
 Research  | Geschäftsführer: Thomas M. McCoy, Giuliano Meroni
 Center    | Sitz: Dornach, Gemeinde Aschheim, Landkreis München
           | Registergericht München, HRB Nr. 43632


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH][RFC] asm-generic:remove calling flush_write_buffers() in dma_sync_*_for_cpu
  2009-06-29 12:31   ` Joerg Roedel
  2009-06-29 13:51     ` Ming Lei
@ 2009-06-29 16:22     ` Arnd Bergmann
  2009-06-29 16:31       ` Alan Cox
  2009-06-29 18:48       ` Joerg Roedel
  1 sibling, 2 replies; 28+ messages in thread
From: Arnd Bergmann @ 2009-06-29 16:22 UTC (permalink / raw)
  To: Joerg Roedel; +Cc: tom.leiming, fujita.tomonori, linux-kernel, akpm

On Monday 29 June 2009, Joerg Roedel wrote:
> On Sun, Jun 28, 2009 at 03:34:35PM +0000, Arnd Bergmann wrote:
> > On Sunday 28 June 2009 14:39:19 tom.leiming@gmail.com wrote:
> > > From: Ming Lei <tom.leiming@gmail.com>
> > > 
> > > dma_sync_*_for_cpu() is introduced to make cpu access dma buffers safely when
> > > dma transfer is over, it seems there is nothing to do with cpu write buffer,
> > > so remove it.
> > > 
> > > Signed-off-by: Ming Lei <tom.leiming@gmail.com>
> > 
> > Right, this looks correct. On a related note, flush_write_buffers is
> > architecture specific right now: only x86 and frv implement it at all,
> > though and with slightly different semantics.
> 
> This doen't look correct to me. The sync functions may do bounce buffering
> which is all about copying data from one place in main memory to another. So we
> need these flush_write_buffer() calls in the _for_cpu path too.

Right, I didn't consider that.

Wouldn't it be better to put the flush_write_buffer in the specific
operation (swiotlb_sync_*_for_*) rather than the multiplexer?

Maybe in that case, smp_wmb() would be more appropriate because
it is defined on all architectures.

> > Maybe it would be more consistent to change the dma_map_* and
> > dma_sync_*_for_device stuff there to wmb() to make it  portable
> > to other architectures.
> 
> If we change it to wmb() it would be executed every time there even if the
> processor doesn't require it. Other architectures could simply add a
> flush_write_buffers() implemention if they want to adapt the common dma-mapping
> implementation, no?

As mentioned, the definition of flush_write_buffers() seems a little dodgy,
I would feel much more comfortable with putting it into the architecture
specific code or using one of the existing common barriers, since we already
have so many of them.

	Arnd <><

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH][RFC] asm-generic:remove calling flush_write_buffers() in dma_sync_*_for_cpu
  2009-06-29 16:22     ` Arnd Bergmann
@ 2009-06-29 16:31       ` Alan Cox
  2009-06-29 16:45         ` Arnd Bergmann
  2009-06-29 18:48       ` Joerg Roedel
  1 sibling, 1 reply; 28+ messages in thread
From: Alan Cox @ 2009-06-29 16:31 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Joerg Roedel, tom.leiming, fujita.tomonori, linux-kernel, akpm

> Wouldn't it be better to put the flush_write_buffer in the specific
> operation (swiotlb_sync_*_for_*) rather than the multiplexer?
> 
> Maybe in that case, smp_wmb() would be more appropriate because
> it is defined on all architectures.

smp_wmb() is stronger and it would slow down x86 if we did that (we'd go
from no-op on a coherent platform to using mfence/lfence etc)


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH][RFC] asm-generic:remove calling flush_write_buffers() in dma_sync_*_for_cpu
  2009-06-29 16:31       ` Alan Cox
@ 2009-06-29 16:45         ` Arnd Bergmann
  2009-06-29 17:16           ` Alan Cox
  2009-06-29 18:47           ` Joerg Roedel
  0 siblings, 2 replies; 28+ messages in thread
From: Arnd Bergmann @ 2009-06-29 16:45 UTC (permalink / raw)
  To: Alan Cox; +Cc: Joerg Roedel, tom.leiming, fujita.tomonori, linux-kernel, akpm

On Monday 29 June 2009, Alan Cox wrote:
> > Wouldn't it be better to put the flush_write_buffer in the specific
> > operation (swiotlb_sync_*_for_*) rather than the multiplexer?
> > 
> > Maybe in that case, smp_wmb() would be more appropriate because
> > it is defined on all architectures.
> 
> smp_wmb() is stronger and it would slow down x86 if we did that (we'd go
> from no-op on a coherent platform to using mfence/lfence etc)
> 
Really? In my copy of system.h, I read

#ifdef CONFIG_SMP
# ifdef CONFIG_X86_OOSTORE
#  define smp_wmb()      wmb()
# else
#  define smp_wmb()      barrier()
# endif
#else
# define smp_wmb()       barrier()
#endif

That actually looks weaker than flush_write_buffer, as it would turn into
a barrier() in case of !SMP or !X86_OOSTORE, and into an sfence instead of
lock addl on all modern CPUs in case of SMP && X86_OOSTORE.

Of course that raises the question of whether smp_wmb() is too weak in case of
!SMP or X86_PPRO_FENCE, but with the described scenario, I don't think
it does.

	Arnd <><

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH][RFC] asm-generic:remove calling flush_write_buffers() in dma_sync_*_for_cpu
  2009-06-29 16:45         ` Arnd Bergmann
@ 2009-06-29 17:16           ` Alan Cox
  2009-06-30 12:34             ` Arnd Bergmann
  2009-06-29 18:47           ` Joerg Roedel
  1 sibling, 1 reply; 28+ messages in thread
From: Alan Cox @ 2009-06-29 17:16 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Joerg Roedel, tom.leiming, fujita.tomonori, linux-kernel, akpm

> Of course that raises the question of whether smp_wmb() is too weak in case of
> !SMP or X86_PPRO_FENCE, but with the described scenario, I don't think
> it does.

I'm pretty sure it is for the Winchip -but we don't care as there are no
SMP ones. OTOH with DMA we do care


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH][RFC] asm-generic:remove calling flush_write_buffers() in dma_sync_*_for_cpu
  2009-06-29 16:45         ` Arnd Bergmann
  2009-06-29 17:16           ` Alan Cox
@ 2009-06-29 18:47           ` Joerg Roedel
  2009-06-29 19:10             ` Alan Cox
  1 sibling, 1 reply; 28+ messages in thread
From: Joerg Roedel @ 2009-06-29 18:47 UTC (permalink / raw)
  To: Arnd Bergmann; +Cc: Alan Cox, tom.leiming, fujita.tomonori, linux-kernel, akpm

On Mon, Jun 29, 2009 at 06:45:15PM +0200, Arnd Bergmann wrote:
> On Monday 29 June 2009, Alan Cox wrote:
> > > Wouldn't it be better to put the flush_write_buffer in the specific
> > > operation (swiotlb_sync_*_for_*) rather than the multiplexer?
> > > 
> > > Maybe in that case, smp_wmb() would be more appropriate because
> > > it is defined on all architectures.
> > 
> > smp_wmb() is stronger and it would slow down x86 if we did that (we'd go
> > from no-op on a coherent platform to using mfence/lfence etc)
> > 
> Really? In my copy of system.h, I read
> 
> #ifdef CONFIG_SMP
> # ifdef CONFIG_X86_OOSTORE
> #  define smp_wmb()      wmb()
> # else
> #  define smp_wmb()      barrier()
> # endif
> #else
> # define smp_wmb()       barrier()
> #endif

With that definition an smp_wmb() would do the right job on x86. If thats also
true for other architectures using this generic header we can remove
flush_write_buffer().

> 
> That actually looks weaker than flush_write_buffer, as it would turn into
> a barrier() in case of !SMP or !X86_OOSTORE, and into an sfence instead of
> lock addl on all modern CPUs in case of SMP && X86_OOSTORE.

X86_OOSTORE is defined for (MWINCHIP3D || MWINCHIPC6) && MTRR. I am not sure if
these chips have sfence.
It would also help to know what OOSTORE exactly means in the context of that
chip.

	Joerg

-- 
           | Advanced Micro Devices GmbH
 Operating | Karl-Hammerschmidt-Str. 34, 85609 Dornach bei München
 System    | 
 Research  | Geschäftsführer: Thomas M. McCoy, Giuliano Meroni
 Center    | Sitz: Dornach, Gemeinde Aschheim, Landkreis München
           | Registergericht München, HRB Nr. 43632


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH][RFC] asm-generic:remove calling flush_write_buffers() in dma_sync_*_for_cpu
  2009-06-29 16:22     ` Arnd Bergmann
  2009-06-29 16:31       ` Alan Cox
@ 2009-06-29 18:48       ` Joerg Roedel
  1 sibling, 0 replies; 28+ messages in thread
From: Joerg Roedel @ 2009-06-29 18:48 UTC (permalink / raw)
  To: Arnd Bergmann; +Cc: tom.leiming, fujita.tomonori, linux-kernel, akpm

On Mon, Jun 29, 2009 at 06:22:17PM +0200, Arnd Bergmann wrote:
> As mentioned, the definition of flush_write_buffers() seems a little dodgy,
> I would feel much more comfortable with putting it into the architecture
> specific code or using one of the existing common barriers, since we already
> have so many of them.

Yeah true. We should remove this function if we see that it is not required.

	Joerg

-- 
           | Advanced Micro Devices GmbH
 Operating | Karl-Hammerschmidt-Str. 34, 85609 Dornach bei München
 System    | 
 Research  | Geschäftsführer: Thomas M. McCoy, Giuliano Meroni
 Center    | Sitz: Dornach, Gemeinde Aschheim, Landkreis München
           | Registergericht München, HRB Nr. 43632


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH][RFC] asm-generic:remove calling flush_write_buffers() in dma_sync_*_for_cpu
  2009-06-29 18:47           ` Joerg Roedel
@ 2009-06-29 19:10             ` Alan Cox
  2009-06-29 19:24               ` Joerg Roedel
  0 siblings, 1 reply; 28+ messages in thread
From: Alan Cox @ 2009-06-29 19:10 UTC (permalink / raw)
  To: Joerg Roedel
  Cc: Arnd Bergmann, tom.leiming, fujita.tomonori, linux-kernel, akpm

> X86_OOSTORE is defined for (MWINCHIP3D || MWINCHIPC6) && MTRR. I am not sure if
> these chips have sfence.
> It would also help to know what OOSTORE exactly means in the context of that
> chip.

Non uncached stores may be visible out of order on the bus.

Alan

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH][RFC] asm-generic:remove calling flush_write_buffers() in dma_sync_*_for_cpu
  2009-06-29 19:10             ` Alan Cox
@ 2009-06-29 19:24               ` Joerg Roedel
  0 siblings, 0 replies; 28+ messages in thread
From: Joerg Roedel @ 2009-06-29 19:24 UTC (permalink / raw)
  To: Alan Cox; +Cc: Arnd Bergmann, tom.leiming, fujita.tomonori, linux-kernel, akpm

On Mon, Jun 29, 2009 at 08:10:26PM +0100, Alan Cox wrote:
> > X86_OOSTORE is defined for (MWINCHIP3D || MWINCHIPC6) && MTRR. I am not sure if
> > these chips have sfence.
> > It would also help to know what OOSTORE exactly means in the context of that
> > chip.
> 
> Non uncached stores may be visible out of order on the bus.

Ah ok, thanks. I think we should document that somewhere. A help text for the
Kconfig entry comes to my mind for that purpose. I will send a patch for this.

	Joerg

-- 
           | Advanced Micro Devices GmbH
 Operating | Karl-Hammerschmidt-Str. 34, 85609 Dornach bei München
 System    | 
 Research  | Geschäftsführer: Thomas M. McCoy, Giuliano Meroni
 Center    | Sitz: Dornach, Gemeinde Aschheim, Landkreis München
           | Registergericht München, HRB Nr. 43632


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH][RFC] asm-generic:remove calling flush_write_buffers() in dma_sync_*_for_cpu
  2009-06-29 17:16           ` Alan Cox
@ 2009-06-30 12:34             ` Arnd Bergmann
  2009-06-30 12:40               ` Alan Cox
  0 siblings, 1 reply; 28+ messages in thread
From: Arnd Bergmann @ 2009-06-30 12:34 UTC (permalink / raw)
  To: Alan Cox; +Cc: Joerg Roedel, tom.leiming, fujita.tomonori, linux-kernel, akpm

On Monday 29 June 2009, Alan Cox wrote:
> > Of course that raises the question of whether smp_wmb() is too weak in case of
> > !SMP or X86_PPRO_FENCE, but with the described scenario, I don't think
> > it does.
> 
> I'm pretty sure it is for the Winchip -but we don't care as there are no
> SMP ones. OTOH with DMA we do care

Ok. The Winchip also does not have an IOMMU or the need for SWIOTLB, so
I guess it would be ok to move the flush_write_buffers() out of the common
code into the x86 pci-nommu implementation. That one is also the only
place that calls flush_write_buffers() in dma_map_().

This would need to get revisited if we want to implement OOSTORE for
VIA Nano, which could then need it in swiotlb.
---
[PATCH] asm-generic: remove calling flush_write_buffers() in dma_sync_*

Flushing the write buffers is only needed on x86 processors using OOSTORE
(currently only pre-VIA Centaur processors) when giving a buffer to the
device, but not for handing it back. 

Based on an earlier patch from Ming Lei <tom.leiming@gmail.com>

Cc: Joerg Roedel <joerg.roedel@amd.com>
Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>

--- a/arch/x86/kernel/pci-nommu.c
+++ b/arch/x86/kernel/pci-nommu.c
@@ -79,12 +79,29 @@ static void nommu_free_coherent(struct device *dev, size_t size, void *vaddr,
 	free_pages((unsigned long)vaddr, get_order(size));
 }
 
+static void nommu_sync_single_for_device(struct device *dev,
+			dma_addr_t addr, size_t size,
+			enum dma_data_direction dir)
+{
+	flush_write_buffers();
+}
+
+
+static void nommu_sync_sg_for_device(struct device *dev,
+			struct scatterlist *sg, int nelems,
+			enum dma_data_direction dir)
+{
+	flush_write_buffers();
+}
+
 struct dma_map_ops nommu_dma_ops = {
-	.alloc_coherent	= dma_generic_alloc_coherent,
-	.free_coherent	= nommu_free_coherent,
-	.map_sg		= nommu_map_sg,
-	.map_page	= nommu_map_page,
-	.is_phys	= 1,
+	.alloc_coherent		= dma_generic_alloc_coherent,
+	.free_coherent		= nommu_free_coherent,
+	.map_sg			= nommu_map_sg,
+	.map_page		= nommu_map_page,
+	.sync_single_for_device = nommu_sync_single_for_device,
+	.sync_sg_for_device	= nommu_sync_sg_for_device,
+	.is_phys		= 1,
 };
 
 void __init no_iommu_init(void)
diff --git a/include/asm-generic/dma-mapping-common.h b/include/asm-generic/dma-mapping-common.h
index 5406a60..e694263 100644
--- a/include/asm-generic/dma-mapping-common.h
+++ b/include/asm-generic/dma-mapping-common.h
@@ -103,7 +103,6 @@ static inline void dma_sync_single_for_cpu(struct device *dev, dma_addr_t addr,
 	if (ops->sync_single_for_cpu)
 		ops->sync_single_for_cpu(dev, addr, size, dir);
 	debug_dma_sync_single_for_cpu(dev, addr, size, dir);
-	flush_write_buffers();
 }
 
 static inline void dma_sync_single_for_device(struct device *dev,
@@ -116,7 +115,6 @@ static inline void dma_sync_single_for_device(struct device *dev,
 	if (ops->sync_single_for_device)
 		ops->sync_single_for_device(dev, addr, size, dir);
 	debug_dma_sync_single_for_device(dev, addr, size, dir);
-	flush_write_buffers();
 }
 
 static inline void dma_sync_single_range_for_cpu(struct device *dev,
@@ -132,7 +130,6 @@ static inline void dma_sync_single_range_for_cpu(struct device *dev,
 		ops->sync_single_range_for_cpu(dev, addr, offset, size, dir);
 		debug_dma_sync_single_range_for_cpu(dev, addr, offset, size, dir);
 
-		flush_write_buffers();
 	} else
 		dma_sync_single_for_cpu(dev, addr, size, dir);
 }
@@ -150,7 +147,6 @@ static inline void dma_sync_single_range_for_device(struct device *dev,
 		ops->sync_single_range_for_device(dev, addr, offset, size, dir);
 		debug_dma_sync_single_range_for_device(dev, addr, offset, size, dir);
 
-		flush_write_buffers();
 	} else
 		dma_sync_single_for_device(dev, addr, size, dir);
 }
@@ -165,7 +161,6 @@ dma_sync_sg_for_cpu(struct device *dev, struct scatterlist *sg,
 	if (ops->sync_sg_for_cpu)
 		ops->sync_sg_for_cpu(dev, sg, nelems, dir);
 	debug_dma_sync_sg_for_cpu(dev, sg, nelems, dir);
-	flush_write_buffers();
 }
 
 static inline void
@@ -179,7 +174,6 @@ dma_sync_sg_for_device(struct device *dev, struct scatterlist *sg,
 		ops->sync_sg_for_device(dev, sg, nelems, dir);
 	debug_dma_sync_sg_for_device(dev, sg, nelems, dir);
 
-	flush_write_buffers();
 }
 
 #define dma_map_single(d, a, s, r) dma_map_single_attrs(d, a, s, r, NULL)

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* Re: [PATCH][RFC] asm-generic:remove calling flush_write_buffers() in dma_sync_*_for_cpu
  2009-06-30 12:34             ` Arnd Bergmann
@ 2009-06-30 12:40               ` Alan Cox
  2009-06-30 12:48                 ` Arnd Bergmann
  0 siblings, 1 reply; 28+ messages in thread
From: Alan Cox @ 2009-06-30 12:40 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Joerg Roedel, tom.leiming, fujita.tomonori, linux-kernel, akpm

> Ok. The Winchip also does not have an IOMMU or the need for SWIOTLB, so
> I guess it would be ok to move the flush_write_buffers() out of the common
> code into the x86 pci-nommu implementation. That one is also the only
> place that calls flush_write_buffers() in dma_map_().

What about non x86 - this is asm-generic you are playing with and its the
kind of change that causes evil really hard to track down and subtle
corruptions and user data loss if you get it wrong.


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH][RFC] asm-generic:remove calling flush_write_buffers() in dma_sync_*_for_cpu
  2009-06-30 12:40               ` Alan Cox
@ 2009-06-30 12:48                 ` Arnd Bergmann
  2009-06-30 13:09                   ` Alan Cox
  0 siblings, 1 reply; 28+ messages in thread
From: Arnd Bergmann @ 2009-06-30 12:48 UTC (permalink / raw)
  To: Alan Cox; +Cc: Joerg Roedel, tom.leiming, fujita.tomonori, linux-kernel, akpm

On Tuesday 30 June 2009, Alan Cox wrote:
> > Ok. The Winchip also does not have an IOMMU or the need for SWIOTLB, so
> > I guess it would be ok to move the flush_write_buffers() out of the common
> > code into the x86 pci-nommu implementation. That one is also the only
> > place that calls flush_write_buffers() in dma_map_().
> 
> What about non x86 - this is asm-generic you are playing with and its the
> kind of change that causes evil really hard to track down and subtle
> corruptions and user data loss if you get it wrong.

Non-x86 is the real motivation for the patch, because the flush_write_buffers
call in this file is currently not implemented and causes build errors
on everything but x86, frv, ia64 and m32r, where the latter two implement
it as an empty macro.

The only users of the file right now are x86 and ia64, and ia64 only added
the empty flush_write_buffers() definition in order to use it.

	Arnd <><

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH][RFC] asm-generic:remove calling flush_write_buffers() in dma_sync_*_for_cpu
  2009-06-30 12:48                 ` Arnd Bergmann
@ 2009-06-30 13:09                   ` Alan Cox
  2009-06-30 13:38                     ` Arnd Bergmann
  0 siblings, 1 reply; 28+ messages in thread
From: Alan Cox @ 2009-06-30 13:09 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Joerg Roedel, tom.leiming, fujita.tomonori, linux-kernel, akpm

> > What about non x86 - this is asm-generic you are playing with and its the
> > kind of change that causes evil really hard to track down and subtle
> > corruptions and user data loss if you get it wrong.
> 
> Non-x86 is the real motivation for the patch, because the flush_write_buffers
> call in this file is currently not implemented and causes build errors
> on everything but x86, frv, ia64 and m32r, where the latter two implement
> it as an empty macro.

Ok so the FRV would grow a similar no iommu patch to the x86.

> The only users of the file right now are x86 and ia64, and ia64 only added
> the empty flush_write_buffers() definition in order to use it.

That now all makes sense.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH][RFC] asm-generic:remove calling flush_write_buffers() in dma_sync_*_for_cpu
  2009-06-30 13:09                   ` Alan Cox
@ 2009-06-30 13:38                     ` Arnd Bergmann
  2009-07-07  1:54                       ` Ming Lei
  0 siblings, 1 reply; 28+ messages in thread
From: Arnd Bergmann @ 2009-06-30 13:38 UTC (permalink / raw)
  To: Alan Cox; +Cc: Joerg Roedel, tom.leiming, fujita.tomonori, linux-kernel, akpm

On Tuesday 30 June 2009, Alan Cox wrote:
> > > What about non x86 - this is asm-generic you are playing with and its the
> > > kind of change that causes evil really hard to track down and subtle
> > > corruptions and user data loss if you get it wrong.
> > 
> > Non-x86 is the real motivation for the patch, because the flush_write_buffers
> > call in this file is currently not implemented and causes build errors
> > on everything but x86, frv, ia64 and m32r, where the latter two implement
> > it as an empty macro.
> 
> Ok so the FRV would grow a similar no iommu patch to the x86.

Well, not even that. dma-mapping-common.h only makes sense on architectures
that have multiple dma-mapping implementations (parisc, mips, arm, powerpc,
sparc, ia64 and x86, possibly alpha). All others including frv only need a
nommu case anyway and would not use dma-mapping-common.h but could be
changed to use something like the dma-mapping-linear.h I worked on recently.

	Arnd <><

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH][RFC] asm-generic:remove calling flush_write_buffers() in  dma_sync_*_for_cpu
  2009-06-30 13:38                     ` Arnd Bergmann
@ 2009-07-07  1:54                       ` Ming Lei
  2009-07-07  7:48                         ` Russell King - ARM Linux
  0 siblings, 1 reply; 28+ messages in thread
From: Ming Lei @ 2009-07-07  1:54 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Alan Cox, Joerg Roedel, fujita.tomonori, linux-kernel, akpm,
	Russell King, linux-arm-kernel

2009/6/30 Arnd Bergmann <arnd@arndb.de>:
> On Tuesday 30 June 2009, Alan Cox wrote:
> Well, not even that. dma-mapping-common.h only makes sense on architectures
> that have multiple dma-mapping implementations (parisc, mips, arm, powerpc,

It seems that there is only one dma-mmaping implementation on ARM, doesn't it?
Is it necessary that using dma-mapping-common.h on ARM?

> sparc, ia64 and x86, possibly alpha). All others including frv only need a
> nommu case anyway and would not use dma-mapping-common.h but could be
> changed to use something like the dma-mapping-linear.h I worked on recently.




-- 
Lei Ming

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH][RFC] asm-generic:remove calling flush_write_buffers() in dma_sync_*_for_cpu
  2009-07-07  1:54                       ` Ming Lei
@ 2009-07-07  7:48                         ` Russell King - ARM Linux
  2009-07-07 13:43                           ` Ming Lei
  0 siblings, 1 reply; 28+ messages in thread
From: Russell King - ARM Linux @ 2009-07-07  7:48 UTC (permalink / raw)
  To: Ming Lei
  Cc: Arnd Bergmann, Alan Cox, Joerg Roedel, fujita.tomonori,
	linux-kernel, akpm, linux-arm-kernel

On Tue, Jul 07, 2009 at 09:54:20AM +0800, Ming Lei wrote:
> 2009/6/30 Arnd Bergmann <arnd@arndb.de>:
> > On Tuesday 30 June 2009, Alan Cox wrote:
> > Well, not even that. dma-mapping-common.h only makes sense on architectures
> > that have multiple dma-mapping implementations (parisc, mips, arm, powerpc,
> 
> It seems that there is only one dma-mmaping implementation on ARM, doesn't it?
> Is it necessary that using dma-mapping-common.h on ARM?

ARM has two (normal, and dma bounce), and in the long run we need to do
cache handling on unmap as well as map due to CPU speculative fetches.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH][RFC] asm-generic:remove calling flush_write_buffers() in  dma_sync_*_for_cpu
  2009-07-07  7:48                         ` Russell King - ARM Linux
@ 2009-07-07 13:43                           ` Ming Lei
  2009-07-07 14:06                             ` Arnd Bergmann
  2009-07-07 17:33                             ` Russell King - ARM Linux
  0 siblings, 2 replies; 28+ messages in thread
From: Ming Lei @ 2009-07-07 13:43 UTC (permalink / raw)
  To: Russell King - ARM Linux
  Cc: Arnd Bergmann, Alan Cox, Joerg Roedel, fujita.tomonori,
	linux-kernel, akpm, linux-arm-kernel

2009/7/7 Russell King - ARM Linux <linux@arm.linux.org.uk>:
> On Tue, Jul 07, 2009 at 09:54:20AM +0800, Ming Lei wrote:
>> 2009/6/30 Arnd Bergmann <arnd@arndb.de>:
>> > On Tuesday 30 June 2009, Alan Cox wrote:
>> > Well, not even that. dma-mapping-common.h only makes sense on architectures
>> > that have multiple dma-mapping implementations (parisc, mips, arm, powerpc,
>>
>> It seems that there is only one dma-mmaping implementation on ARM, doesn't it?
>> Is it necessary that using dma-mapping-common.h on ARM?
>
> ARM has two (normal, and dma bounce), and in the long run we need to do

OK, Can we use dma-mapping-common.h on ARM?

> cache handling on unmap as well as map due to CPU speculative fetches.

IMHO, it seems we can fix this problem now.

For DMA_TO_DEVICE transfer, clean cache in dma map, but does nothing in
dma unmap;

For  DMA_FROM_DEVICE, we may do nothing in dma map, but invaliate cache
in dma unmap.

Is it doable?

Thanks.
-- 
Lei Ming

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH][RFC] asm-generic:remove calling flush_write_buffers() in dma_sync_*_for_cpu
  2009-07-07 13:43                           ` Ming Lei
@ 2009-07-07 14:06                             ` Arnd Bergmann
  2009-07-07 14:55                               ` Ming Lei
  2009-07-07 17:36                               ` Russell King - ARM Linux
  2009-07-07 17:33                             ` Russell King - ARM Linux
  1 sibling, 2 replies; 28+ messages in thread
From: Arnd Bergmann @ 2009-07-07 14:06 UTC (permalink / raw)
  To: Ming Lei
  Cc: Russell King - ARM Linux, Alan Cox, Joerg Roedel, fujita.tomonori,
	linux-kernel, akpm, linux-arm-kernel

On Tuesday 07 July 2009, Ming Lei wrote:
> > ARM has two (normal, and dma bounce), and in the long run we need to do
> 
> OK, Can we use dma-mapping-common.h on ARM?

It should work in principle. It may be a good idea to also move to
the generic swiotlb instead of the traditional dma bounce at the
same time.

Note that dma-mapping-common.h is only needed if you want to support
two or more different DMA implementations in a single kernel, which
I'm not sure is needed for ARM.

> > cache handling on unmap as well as map due to CPU speculative fetches.
> 
> IMHO, it seems we can fix this problem now.
> 
> For DMA_TO_DEVICE transfer, clean cache in dma map, but does nothing in
> dma unmap;
> 
> For  DMA_FROM_DEVICE, we may do nothing in dma map, but invaliate cache
> in dma unmap.

A number of other architectures do this already. You also need to
have dma_sync_*_for_cpu and dma_sync_*_for_device, where the *_for_device
operation needs to do the same flushing as dma_map_* and *_for_cpu
does the same as dma_unmap_*.

Note that actually you need to do writeback+invalidate in DMA_TO_DEVICE
and at least an invalidate in DMA_FROM_DEVICE during dma_map_*.
For the unmap, I don't think you ever need to invalidate the cache.
If you invalidate only at unmap time for DMA_FROM_DEVICE, a dirty
cache line might be accidentally flushed to the buffer after
the device has written to it.

	Arnd <><

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH][RFC] asm-generic:remove calling flush_write_buffers() in  dma_sync_*_for_cpu
  2009-07-07 14:06                             ` Arnd Bergmann
@ 2009-07-07 14:55                               ` Ming Lei
  2009-07-07 15:30                                 ` Arnd Bergmann
  2009-07-07 17:36                               ` Russell King - ARM Linux
  1 sibling, 1 reply; 28+ messages in thread
From: Ming Lei @ 2009-07-07 14:55 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Russell King - ARM Linux, Alan Cox, Joerg Roedel, fujita.tomonori,
	linux-kernel, akpm, linux-arm-kernel

2009/7/7 Arnd Bergmann <arnd@arndb.de>:
> On Tuesday 07 July 2009, Ming Lei wrote:
>> > ARM has two (normal, and dma bounce), and in the long run we need to do
>>
>> OK, Can we use dma-mapping-common.h on ARM?
>
> It should work in principle. It may be a good idea to also move to
> the generic swiotlb instead of the traditional dma bounce at the
> same time.
>
> Note that dma-mapping-common.h is only needed if you want to support
> two or more different DMA implementations in a single kernel, which
> I'm not sure is needed for ARM.
>
>> > cache handling on unmap as well as map due to CPU speculative fetches.
>>
>> IMHO, it seems we can fix this problem now.
>>
>> For DMA_TO_DEVICE transfer, clean cache in dma map, but does nothing in
>> dma unmap;
>>
>> For  DMA_FROM_DEVICE, we may do nothing in dma map, but invaliate cache
>> in dma unmap.
>
> A number of other architectures do this already. You also need to
> have dma_sync_*_for_cpu and dma_sync_*_for_device, where the *_for_device
> operation needs to do the same flushing as dma_map_* and *_for_cpu
> does the same as dma_unmap_*.
>
> Note that actually you need to do writeback+invalidate in DMA_TO_DEVICE

IMHO,  writeback in dma map is enough for DMA_TO_DEVICE transfer.

> and at least an invalidate in DMA_FROM_DEVICE during dma_map_*.
> For the unmap, I don't think you ever need to invalidate the cache.

No,  as Russell said that CPU speculative fetches can make cache "dirty" after
dma map but before dma is over, then CPU may read inconsistent data
after DMA_FROM_DEVICE transfer is finished if does not invalidate
cache in
dma unmap.

> If you invalidate only at unmap time for DMA_FROM_DEVICE, a dirty
> cache line might be accidentally flushed to the buffer after
> the device has written to it.

It seems you are reasonable, and we do need to invalidate cache in both dma
map and dma unmap for DMA_FROM_DEVICE, don't we?

-- 
Lei Ming

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH][RFC] asm-generic:remove calling flush_write_buffers() in dma_sync_*_for_cpu
  2009-07-07 14:55                               ` Ming Lei
@ 2009-07-07 15:30                                 ` Arnd Bergmann
  0 siblings, 0 replies; 28+ messages in thread
From: Arnd Bergmann @ 2009-07-07 15:30 UTC (permalink / raw)
  To: Ming Lei
  Cc: Russell King - ARM Linux, Alan Cox, Joerg Roedel, fujita.tomonori,
	linux-kernel, akpm, linux-arm-kernel

On Tuesday 07 July 2009, Ming Lei wrote:
> > Note that actually you need to do writeback+invalidate in DMA_TO_DEVICE
> 
> IMHO,  writeback in dma map is enough for DMA_TO_DEVICE transfer.

Right. writeback+invalidate would only be needed for bidirectional
transfers but not for DMA_TO_DEVICE.

> > and at least an invalidate in DMA_FROM_DEVICE during dma_map_*.
> > For the unmap, I don't think you ever need to invalidate the cache.
> 
> No,  as Russell said that CPU speculative fetches can make cache "dirty" after
> dma map but before dma is over, then CPU may read inconsistent data
> after DMA_FROM_DEVICE transfer is finished if does not invalidate
> cache in
> dma unmap.

Hmm, I don't think any of the other architectures currently
try to prevent this. Is ARM special in this regard, or does that
mean that the others are broken? What kind of code would cause
a speculative read on a data section that the kernel (by convention)
must not access? Would it be enough to have an rmb() in dma_unmap_*
and dma_sync_*_for_cpu to prevent speculative accesses?

> > If you invalidate only at unmap time for DMA_FROM_DEVICE, a dirty
> > cache line might be accidentally flushed to the buffer after
> > the device has written to it.
> 
> It seems you are reasonable, and we do need to invalidate cache in both dma
> map and dma unmap for DMA_FROM_DEVICE, don't we?

Is it guaranteed that a cache line from a speculative fetch will
never be written back? Otherwise it would still be broken.

	Arnd <><

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH][RFC] asm-generic:remove calling flush_write_buffers() in dma_sync_*_for_cpu
  2009-07-07 13:43                           ` Ming Lei
  2009-07-07 14:06                             ` Arnd Bergmann
@ 2009-07-07 17:33                             ` Russell King - ARM Linux
  1 sibling, 0 replies; 28+ messages in thread
From: Russell King - ARM Linux @ 2009-07-07 17:33 UTC (permalink / raw)
  To: Ming Lei
  Cc: Arnd Bergmann, Alan Cox, Joerg Roedel, fujita.tomonori,
	linux-kernel, akpm, linux-arm-kernel

On Tue, Jul 07, 2009 at 09:43:37PM +0800, Ming Lei wrote:
> 2009/7/7 Russell King - ARM Linux <linux@arm.linux.org.uk>:
> > On Tue, Jul 07, 2009 at 09:54:20AM +0800, Ming Lei wrote:
> >> 2009/6/30 Arnd Bergmann <arnd@arndb.de>:
> >> > On Tuesday 30 June 2009, Alan Cox wrote:
> >> > Well, not even that. dma-mapping-common.h only makes sense on architectures
> >> > that have multiple dma-mapping implementations (parisc, mips, arm, powerpc,
> >>
> >> It seems that there is only one dma-mmaping implementation on ARM, doesn't it?
> >> Is it necessary that using dma-mapping-common.h on ARM?
> >
> > ARM has two (normal, and dma bounce), and in the long run we need to do
> 
> OK, Can we use dma-mapping-common.h on ARM?

No.

> > cache handling on unmap as well as map due to CPU speculative fetches.
> 
> IMHO, it seems we can fix this problem now.
> 
> For DMA_TO_DEVICE transfer, clean cache in dma map, but does nothing in
> dma unmap;
> 
> For  DMA_FROM_DEVICE, we may do nothing in dma map, but invaliate cache
> in dma unmap.

No.  If there are dirty lines, they can be written back, overwriting the
data which has been DMA'd there.

Basically, with speculative prefetching, we need to:

- for DMA_TO_DEVICE, clean the cache in the DMA map and do nothing on DMA unmap
- for DMA_FROM_DEVICE, invalidate the cache on DMA map and again on DMA unmap

Moreover, this isn't going to be trivial.  In order to do these cache
operations, we need the virtual address.  However, the unmap APIs aren't
given this information (though dma_to_virt() will do what's required for
the dma_unmap_single case.)

dma_unmap_page() on the other hand, we don't have any API existing to
turn a DMA address back into a page struct...

If only we had an architecture specific struct to contain whatever
information needs to be carried from map to unmap...

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH][RFC] asm-generic:remove calling flush_write_buffers() in dma_sync_*_for_cpu
  2009-07-07 14:06                             ` Arnd Bergmann
  2009-07-07 14:55                               ` Ming Lei
@ 2009-07-07 17:36                               ` Russell King - ARM Linux
  1 sibling, 0 replies; 28+ messages in thread
From: Russell King - ARM Linux @ 2009-07-07 17:36 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Ming Lei, Alan Cox, Joerg Roedel, fujita.tomonori, linux-kernel,
	akpm, linux-arm-kernel

On Tue, Jul 07, 2009 at 04:06:48PM +0200, Arnd Bergmann wrote:
> Note that actually you need to do writeback+invalidate in DMA_TO_DEVICE

Rubbish.  DMA _to_ the device just needs any data held in the cache to
be pushed out into the hardware RAM.  There's absolutely no point in
invalidating it - that's a complete waste of time.

> and at least an invalidate in DMA_FROM_DEVICE during dma_map_*.
> For the unmap, I don't think you ever need to invalidate the cache.

As I've already explained, you do if you have speculative prefetching
which can bring in data into the cache from _any_ memory location at
_any_ time.  We _are_ going to have to do this, no questions asked.

^ permalink raw reply	[flat|nested] 28+ messages in thread

end of thread, other threads:[~2009-07-07 17:36 UTC | newest]

Thread overview: 28+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2009-06-28 14:39 [PATCH][RFC] asm-generic:remove calling flush_write_buffers() in dma_sync_*_for_cpu tom.leiming
2009-06-28 15:34 ` Arnd Bergmann
2009-06-29 12:31   ` Joerg Roedel
2009-06-29 13:51     ` Ming Lei
2009-06-29 14:45       ` Joerg Roedel
2009-06-29 14:54         ` Ming Lei
2009-06-29 15:44           ` Joerg Roedel
2009-06-29 16:22     ` Arnd Bergmann
2009-06-29 16:31       ` Alan Cox
2009-06-29 16:45         ` Arnd Bergmann
2009-06-29 17:16           ` Alan Cox
2009-06-30 12:34             ` Arnd Bergmann
2009-06-30 12:40               ` Alan Cox
2009-06-30 12:48                 ` Arnd Bergmann
2009-06-30 13:09                   ` Alan Cox
2009-06-30 13:38                     ` Arnd Bergmann
2009-07-07  1:54                       ` Ming Lei
2009-07-07  7:48                         ` Russell King - ARM Linux
2009-07-07 13:43                           ` Ming Lei
2009-07-07 14:06                             ` Arnd Bergmann
2009-07-07 14:55                               ` Ming Lei
2009-07-07 15:30                                 ` Arnd Bergmann
2009-07-07 17:36                               ` Russell King - ARM Linux
2009-07-07 17:33                             ` Russell King - ARM Linux
2009-06-29 18:47           ` Joerg Roedel
2009-06-29 19:10             ` Alan Cox
2009-06-29 19:24               ` Joerg Roedel
2009-06-29 18:48       ` Joerg Roedel

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox