* consistent_sync_for_cpu() and friends on ppc32
@ 2004-03-15 20:16 Olaf Hering
2004-03-15 20:36 ` David S. Miller
0 siblings, 1 reply; 8+ messages in thread
From: Olaf Hering @ 2004-03-15 20:16 UTC (permalink / raw)
To: David S. Miller, Benjamin Herrenschmidt; +Cc: linux-kernel
David,
what is the fix for ppc32? This patch went into Linus' tree:
people/akpm/patches/2.6/2.6.4/2.6.4-mm1/broken-out/dma_sync_for_device-cpu.patch
In file included from include/linux/pci.h:720,
from drivers/net/sunhme.c:62:
include/asm/pci.h: In function `pci_dma_sync_single_for_cpu':
include/asm/pci.h:203: warning: implicit declaration of function `consistent_sync_for_cpu'
include/asm/pci.h: In function `pci_dma_sync_single_for_device':
include/asm/pci.h:212: warning: implicit declaration of function `consistent_sync_for_device'
include/asm/pci.h: In function `pci_dma_sync_sg_for_cpu':
include/asm/pci.h:230: warning: implicit declaration of function `consistent_sync_page_for_cpu'
include/asm/pci.h: In function `pci_dma_sync_sg_for_device':
include/asm/pci.h:243: warning: implicit declaration of function `consistent_sync_page_for_device'
--
USB is for mice, FireWire is for men!
sUse lINUX ag, nÜRNBERG
* Re: consistent_sync_for_cpu() and friends on ppc32
2004-03-15 20:16 consistent_sync_for_cpu() and friends on ppc32 Olaf Hering
@ 2004-03-15 20:36 ` David S. Miller
2004-03-15 21:59 ` Benjamin Herrenschmidt
2004-03-16 0:23 ` Benjamin Herrenschmidt
0 siblings, 2 replies; 8+ messages in thread
From: David S. Miller @ 2004-03-15 20:36 UTC (permalink / raw)
To: Olaf Hering; +Cc: benh, linux-kernel
On Mon, 15 Mar 2004 21:16:16 +0100
Olaf Hering <olh@suse.de> wrote:
> what is the fix for ppc32? This patch went into Linus' tree:
> people/akpm/patches/2.6/2.6.4/2.6.4-mm1/broken-out/dma_sync_for_device-cpu.patch
...
> include/asm/pci.h: In function `pci_dma_sync_single_for_cpu':
Ben, can you work this out? I can make it compile by just making the
_for_cpu and _for_device routines behave identically to what the
consistent_sync{,_page}() stuff does now. But I'd much rather a ppc32
person implement it correctly and optimally.
In short, the _for_device routines should make sure cacheable data in
the cpu is fully visible to the DMA device, and _for_cpu should make
sure all device DMA is visible to the processor.
* Re: consistent_sync_for_cpu() and friends on ppc32
2004-03-15 20:36 ` David S. Miller
@ 2004-03-15 21:59 ` Benjamin Herrenschmidt
2004-03-16 0:23 ` Benjamin Herrenschmidt
1 sibling, 0 replies; 8+ messages in thread
From: Benjamin Herrenschmidt @ 2004-03-15 21:59 UTC (permalink / raw)
To: David S. Miller; +Cc: Olaf Hering, Linux Kernel list
> Ben, can you work this out? I can make it compile by just making the
> _for_cpu and _for_device routines behave identically to what the
> consistent_sync{,_page}() stuff does now. But I'd much rather a ppc32
> person implement it correctly and optimally.
>
> In short, the _for_device routines should make sure cacheable data in
> the cpu is fully visible to the DMA device, and _for_cpu should make
> sure all device DMA is visible to the processor.
Yup, it depends on which CPU the kernel is compiled for. Normal desktop
CPUs are completely coherent, so this will be a no-op, but 4xx/8xx
embedded CPUs will need something better. I'll have a look today.
Ben.
* Re: consistent_sync_for_cpu() and friends on ppc32
2004-03-15 20:36 ` David S. Miller
2004-03-15 21:59 ` Benjamin Herrenschmidt
@ 2004-03-16 0:23 ` Benjamin Herrenschmidt
2004-03-16 0:49 ` David S. Miller
1 sibling, 1 reply; 8+ messages in thread
From: Benjamin Herrenschmidt @ 2004-03-16 0:23 UTC (permalink / raw)
To: David S. Miller; +Cc: Olaf Hering, Linux Kernel list
> Ben, can you work this out? I can make it compile by just making the
> _for_cpu and _for_device routines behave identically to what the
> consistent_sync{,_page}() stuff does now. But I'd much rather a ppc32
> person implement it correctly and optimally.
>
> In short, the _for_device routines should make sure cacheable data in
> the cpu is fully visible to the DMA device, and _for_cpu should make
> sure all device DMA is visible to the processor.
BTW, I missed your explanation in the first place, but why wouldn't
the "direction" field be enough? I'm not sure if I need a different
implementation here...
Ben.
* Re: consistent_sync_for_cpu() and friends on ppc32
2004-03-16 0:23 ` Benjamin Herrenschmidt
@ 2004-03-16 0:49 ` David S. Miller
2004-03-16 1:36 ` Benjamin Herrenschmidt
0 siblings, 1 reply; 8+ messages in thread
From: David S. Miller @ 2004-03-16 0:49 UTC (permalink / raw)
To: Benjamin Herrenschmidt; +Cc: olh, linux-kernel
On Tue, 16 Mar 2004 11:23:42 +1100
Benjamin Herrenschmidt <benh@kernel.crashing.org> wrote:
> BTW, I missed your explanation in the first place, but why wouldn't
> the "direction" field be enough? I'm not sure if I need a different
> implementation here...
Direction says something different. It says which direction the DMA
goes, whilst these interfaces say who wishes to have ownership of the
buffer now.
Consider this example, and how one might implement this on a system with
cpu caches which are coherent with neither main memory nor devices.
1) User prepares buffer X with data.
2) pci_map_single(X, TO_DEVICE)
3) Device does DMA, interrupts cpu.
4) pci_dma_sync_single_for_cpu(X)
5) Write new contents.
6) pci_dma_sync_single_for_device(X)
7) Device does DMA again, interrupts cpu.
8) ...
Step 2 would writeback flush the cpu cache, step 4 would be a NOP,
step 6 would writeback flush the cpu cache.
The direction alone does not provide enough information to perform
these operations correctly.
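The sequence above can be sketched as a toy user-space model. This is
not kernel code: struct toy_buf, cpu_write(), writeback_flush(),
device_dma_read() and the toy_* wrappers are hypothetical names made up
for illustration, and the "cache" is simply a second copy of the buffer
that the device cannot see. The sketch assumes only the policy described
above: a writeback flush at map time and at sync-for-device time, with
sync-for-cpu as a NOP.

```c
#include <assert.h>
#include <string.h>

#define BUF_SIZE 16

/* Toy model of one buffer: the CPU's cached view and main memory.
 * The device can only see main memory. */
struct toy_buf {
    char cache[BUF_SIZE];   /* CPU view, may hold dirty data */
    char memory[BUF_SIZE];  /* what the device reads via DMA */
};

/* A CPU store lands in the (write-back) cache only. */
static void cpu_write(struct toy_buf *b, const char *s)
{
    assert(strlen(s) < BUF_SIZE);
    strncpy(b->cache, s, BUF_SIZE - 1);
    b->cache[BUF_SIZE - 1] = '\0';
}

/* Writeback flush: push the cached contents out to main memory. */
static void writeback_flush(struct toy_buf *b)
{
    memcpy(b->memory, b->cache, BUF_SIZE);
}

/* Hypothetical stand-ins for steps 2, 4 and 6. */
static void toy_map_single_to_device(struct toy_buf *b) { writeback_flush(b); }
static void toy_sync_single_for_cpu(struct toy_buf *b) { (void)b; /* NOP */ }
static void toy_sync_single_for_device(struct toy_buf *b) { writeback_flush(b); }

/* The device reads main memory, never the CPU cache. */
static void device_dma_read(const struct toy_buf *b, char *out)
{
    memcpy(out, b->memory, BUF_SIZE);
}

/* Runs steps 1-7. Returns 1 if the device saw the intended data on
 * both DMA passes, 0 otherwise. Step 6 can be skipped to show why
 * it is needed. */
static int run_sequence(int do_step6)
{
    struct toy_buf b;
    char seen[BUF_SIZE];

    memset(&b, 0, sizeof(b));
    cpu_write(&b, "first");                 /* step 1 */
    toy_map_single_to_device(&b);           /* step 2 */
    device_dma_read(&b, seen);              /* step 3 */
    if (strcmp(seen, "first") != 0)
        return 0;
    toy_sync_single_for_cpu(&b);            /* step 4 */
    cpu_write(&b, "second");                /* step 5 */
    if (do_step6)
        toy_sync_single_for_device(&b);     /* step 6 */
    device_dma_read(&b, seen);              /* step 7 */
    return strcmp(seen, "second") == 0;
}
```

In this model run_sequence(1) returns 1, while run_sequence(0), which
skips step 6, returns 0 because the device reads the stale first
contents from memory. The direction stays TO_DEVICE throughout, so it
cannot tell step 4 from step 6.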
* Re: consistent_sync_for_cpu() and friends on ppc32
2004-03-16 0:49 ` David S. Miller
@ 2004-03-16 1:36 ` Benjamin Herrenschmidt
2004-03-16 18:46 ` David S. Miller
0 siblings, 1 reply; 8+ messages in thread
From: Benjamin Herrenschmidt @ 2004-03-16 1:36 UTC (permalink / raw)
To: David S. Miller; +Cc: Olaf Hering, Linux Kernel list
> 1) User prepares buffer X with data.
> 2) pci_map_single(X, TO_DEVICE)
> 3) Device does DMA, interrupts cpu.
> 4) pci_dma_sync_single_for_cpu(X)
> 5) Write new contents.
> 6) pci_dma_sync_single_for_device(X)
> 7) Device does DMA again, interrupts cpu.
> 8) ...
>
> Step 2 would writeback flush the cpu cache, step 4 would be a NOP,
> step 6 would writeback flush the cpu cache.
>
> The direction alone does not provide enough information to perform
> these operations correctly.
Hrm... I'm still not sure how I'm supposed to implement those
for non-consistent PPCs (embedded). We don't carry state information
around, so I suppose I'll have to rely on the direction being the
same for the whole duration of the operation... In which case, it's
just a matter of having for_cpu nop'ing when direction is TO_DEVICE
and for_device nop'ing when direction is FROM_DEVICE? Not clear
imho...
Ben.
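The tentative rule in the message above can be written down as a small
predicate. This is only a sketch of that proposal, not the ppc32
implementation: enum toy_dma_dir, enum toy_sync_target and
toy_sync_is_nop() are hypothetical names, and the sketch assumes, as the
message does, that the direction stays the same for the whole lifetime
of the mapping.

```c
/* Hypothetical local stand-ins for the DMA direction values. */
enum toy_dma_dir {
    TOY_BIDIRECTIONAL,
    TOY_TO_DEVICE,
    TOY_FROM_DEVICE
};

/* Which side is about to take ownership of the buffer. */
enum toy_sync_target {
    TOY_SYNC_FOR_CPU,
    TOY_SYNC_FOR_DEVICE
};

/* Returns 1 when, under the proposed rule, the sync call needs no
 * cache maintenance: for_cpu on a TO_DEVICE mapping, or for_device
 * on a FROM_DEVICE mapping. Everything else needs real work. */
static int toy_sync_is_nop(enum toy_sync_target target, enum toy_dma_dir dir)
{
    if (target == TOY_SYNC_FOR_CPU)
        return dir == TOY_TO_DEVICE;
    return dir == TOY_FROM_DEVICE;
}
```

Under this rule a BIDIRECTIONAL mapping is never a no-op for either
call, since the sketch has no state to tell which side last wrote the
buffer.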
* Re: consistent_sync_for_cpu() and friends on ppc32
2004-03-16 1:36 ` Benjamin Herrenschmidt
@ 2004-03-16 18:46 ` David S. Miller
2004-03-16 21:54 ` Benjamin Herrenschmidt
0 siblings, 1 reply; 8+ messages in thread
From: David S. Miller @ 2004-03-16 18:46 UTC (permalink / raw)
To: Benjamin Herrenschmidt; +Cc: olh, linux-kernel
On Tue, 16 Mar 2004 12:36:07 +1100
Benjamin Herrenschmidt <benh@kernel.crashing.org> wrote:
> Hrm... I'm still not sure how I'm supposed to implement those
> for non-consistent PPCs (embedded). We don't carry state information
> around, so I suppose I'll have to rely on the direction being the
> same for the whole duration of the operation... In which case, it's
> just a matter of having for_cpu nop'ing when direction is TO_DEVICE
> and for_device nop'ing when direction is FROM_DEVICE? Not clear
> imho...
See, the direction really doesn't matter for the sync ops.
If you flush the cpu caches at MAP time, and your PCI controller doesn't
have DMA caching or something like that, then sync for CPU can always be
a nop. You will have always previously flushed the cpu caches before
giving the buffer back to the device, either via MAP or sync for device
calls.
So basically, make MAP and sync for device writeback flush the cpu caches.
* Re: consistent_sync_for_cpu() and friends on ppc32
2004-03-16 18:46 ` David S. Miller
@ 2004-03-16 21:54 ` Benjamin Herrenschmidt
0 siblings, 0 replies; 8+ messages in thread
From: Benjamin Herrenschmidt @ 2004-03-16 21:54 UTC (permalink / raw)
To: David S. Miller; +Cc: Olaf Hering, Linux Kernel list
> See, the direction really doesn't matter for the sync ops.
Well, the direction makes the difference between a flush and an
invalidation ;)
> If you flush the cpu caches at MAP time, and your PCI controller doesn't
> have DMA caching or something like that, then sync for CPU can always be
> a nop. You will have always previously flushed the cpu caches before
> giving the buffer back to the device, either via MAP or sync for device
> calls.
>
> So basically, make MAP and sync for device writeback flush the cpu caches.
No: flush on TO_DEVICE and BIDIRECTIONAL, invalidate on FROM_DEVICE.
It's less expensive to invalidate than to flush in that case, since we
don't care about writing whatever junk the cache contained for this
area back to real memory.
Ben.
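The policy stated above, choosing the cache operation from the
direction, might look roughly like the following. It is a minimal
sketch: the enum names and toy_cache_op_for() are hypothetical and
local to this example, and no real cache instructions are issued.

```c
#include <assert.h>

/* Hypothetical local stand-ins for the DMA direction values. */
enum toy_dma_dir {
    TOY_BIDIRECTIONAL,
    TOY_TO_DEVICE,
    TOY_FROM_DEVICE,
    TOY_NONE
};

/* The two cache maintenance operations being discussed. */
enum toy_cache_op {
    TOY_OP_FLUSH,       /* write dirty lines back to memory */
    TOY_OP_INVALIDATE   /* discard lines without writing them back */
};

/* Pick the operation for a mapping with the given direction:
 * invalidate for FROM_DEVICE, flush for TO_DEVICE and BIDIRECTIONAL. */
static enum toy_cache_op toy_cache_op_for(enum toy_dma_dir dir)
{
    assert(dir != TOY_NONE);    /* a mapping must have a direction */

    switch (dir) {
    case TOY_FROM_DEVICE:
        /* The device will overwrite the buffer, so whatever the
         * cache holds for it never needs to reach memory. */
        return TOY_OP_INVALIDATE;
    case TOY_TO_DEVICE:
    case TOY_BIDIRECTIONAL:
    default:
        /* The device may read the buffer, so dirty data must be
         * written back first. */
        return TOY_OP_FLUSH;
    }
}
```

The saving for FROM_DEVICE comes from skipping the write-back of lines
the device is about to overwrite anyway.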