* non barrier versions of dma_map functions
@ 2009-12-07 19:37 adharmap at codeaurora.org
  2009-12-07 19:35 ` Russell King - ARM Linux
  0 siblings, 1 reply; 7+ messages in thread
From: adharmap at codeaurora.org @ 2009-12-07 19:37 UTC (permalink / raw)
  To: linux-arm-kernel

We have a situation where we need to dma map multiple cached buffers for a
single dma transaction.

The current DMA API suggests using dma_map_single for cache
consistency. On ARMv7 it performs the necessary cache operations and then
executes a data synchronization barrier (DSB) instruction. In our case we
would execute multiple DSB instructions before starting the DMA
operation, but we need memory to be consistent only after we map the last
buffer.

I am thinking we could define "no barrier" versions of all the mapping
functions, plus a barrier function that issues a DSB before the DMA is
started.
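
The proposal above can be illustrated with a small userspace model. This is only a sketch under stated assumptions: every name in it (model_map_single, model_map_single_nobarrier, model_dma_barrier and the counters) is hypothetical and is not part of the kernel DMA API, and the cache clean and DSB are replaced by counters so that the number of barriers issued can be compared.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace model only: none of these names exist in the kernel DMA API. */
static unsigned int clean_count;    /* cache-clean operations issued */
static unsigned int barrier_count;  /* DSB-like barriers issued */

/* Stand-in for the per-buffer cache clean; it issues no barrier. */
static void model_cache_clean(const void *buf, size_t len)
{
    assert(buf != NULL && len > 0);
    clean_count++;
}

/* Stand-in for the ARMv7 DSB instruction. */
static void model_dma_barrier(void)
{
    barrier_count++;
}

/* Hypothetical "no barrier" mapping: cache clean only. */
static void model_map_single_nobarrier(const void *buf, size_t len)
{
    model_cache_clean(buf, len);
}

/* Models the behaviour described in the message: clean, then barrier. */
static void model_map_single(const void *buf, size_t len)
{
    model_map_single_nobarrier(buf, len);
    model_dma_barrier();
}

/* Map n buffers one by one; returns the barriers issued (n). */
static unsigned int map_all_current(const void *const bufs[], size_t n, size_t len)
{
    unsigned int before = barrier_count;

    for (size_t i = 0; i < n; i++)
        model_map_single(bufs[i], len);
    return barrier_count - before;
}

/* Map n buffers without barriers, then issue one barrier; returns 1. */
static unsigned int map_all_batched(const void *const bufs[], size_t n, size_t len)
{
    unsigned int before = barrier_count;

    for (size_t i = 0; i < n; i++)
        model_map_single_nobarrier(bufs[i], len);
    model_dma_barrier();
    return barrier_count - before;
}
```

In this model, mapping three buffers with map_all_current issues three barriers, while map_all_batched issues one; the number of cache cleans is the same in both cases.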

Requesting alternative ideas or code design to get the desired nonbarrier
versions of the mapping functions.

Abhijeet Dharmapurikar

* non barrier versions of dma_map functions
@ 2009-12-12  0:12 Abhijeet Dharmapurikar
  0 siblings, 0 replies; 7+ messages in thread
From: Abhijeet Dharmapurikar @ 2009-12-12  0:12 UTC (permalink / raw)
  To: linux-kernel


Hello All,

   This is a request to extend the DMA API for efficient handling of
multiple buffers or scatter-gather mapping/unmapping operations.

   I am working on an ARMv7 device and we have a situation where we need
to DMA-map multiple cached buffers for a single DMA transaction.

   The current DMA API suggests using dma_map_single/dma_unmap_single
for cache consistency. On ARMv7 these perform the necessary cache
operations and then execute a data synchronization barrier (DSB)
instruction. In our case we would execute multiple DSB instructions
before starting the DMA operation, but we need memory to be consistent
only after we map the last buffer.

   I am thinking we could define "no barrier" versions of all the
mapping/unmapping functions, plus a barrier function that issues a DSB
before the DMA is started.

   Here are numbers from a test run on my board.

It kmallocs N buffers of size 'size', dirties their cache lines by
writing to them, and calls dma_map_single, which calls the arch-specific
clean operations, with and without DSB. In the "without DSB" case a
single DSB is executed after the last buffer is mapped. Times are in
microseconds.

size (bytes)   N    map_single (us)   map_single w/o DSB (us)   delta
128            16   8                 5                         60%
512            16   9                 6                         50%
512            32   15                8                         88%
512            48   20                11                        82%
512            64   27                14                        93%
64             4    4                 3                         33%
64             8    4                 3                         33%
64             16   7                 4                         75%
64             32   12                4                         200%
64             48   17                6                         183%
64             64   21                7                         200%
1024           16   9                 7                         29%
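
The message does not state how the delta column is computed. The figures are consistent with the extra time of the DSB variant taken relative to the variant without DSB, that is (with - without) / without * 100, rounded to the nearest integer. A minimal sketch under that assumption, with a hypothetical helper name:

```c
#include <assert.h>

/*
 * Percentage by which the time with a per-buffer DSB exceeds the time
 * without it, rounded to the nearest integer.
 * Assumed formula (not stated in the message):
 *   (with - without) / without * 100
 */
static unsigned int delta_percent(unsigned int with_dsb_us,
                                  unsigned int without_dsb_us)
{
    assert(without_dsb_us > 0 && with_dsb_us >= without_dsb_us);

    unsigned int diff = with_dsb_us - without_dsb_us;

    /* Adding half the divisor before dividing rounds to nearest. */
    return (100u * diff + without_dsb_us / 2u) / without_dsb_us;
}
```

For example, the first row (8 us against 5 us) gives 60, and the row with 15 us against 8 us gives 88 (87.5 rounded up), matching the table.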

These buffer sizes and values of N are very close to the real-world
sizes the framebuffer driver handles. Cases where N is large are the
most common.

Clearly, we could benefit from no-barrier versions of the cache
operations, and we could use them in scatter-gather mappings as well.

Since this kind of API change will affect all the platforms, I was 
directed by the arm-linux community to take this up on the linux kernel 
mailing list.

For architectures that don't need a barrier for the completion of
cache operations we can simply call the existing
dma_map_single/dma_unmap_single.
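
One way to picture this fallback is a compile-time switch, again as a userspace model rather than kernel code. MODEL_ARCH_NEEDS_DMA_BARRIER and all function names below are hypothetical: when the switch is 0, the "no barrier" call is just the existing mapping call and the barrier function does nothing.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical switch for this model; not a real kernel config symbol. */
#ifndef MODEL_ARCH_NEEDS_DMA_BARRIER
#define MODEL_ARCH_NEEDS_DMA_BARRIER 0
#endif

static unsigned int cleans_done;    /* cache-clean operations issued */
static unsigned int barriers_done;  /* DSB-like barriers issued */

/* Stand-in for the arch-specific cache clean. */
static void model_cache_clean(const void *buf, size_t len)
{
    assert(buf != NULL && len > 0);
    cleans_done++;
}

#if MODEL_ARCH_NEEDS_DMA_BARRIER
/* ARMv7-like case: the barrier is real and is split out of the mapping. */
static void model_dma_barrier(void)
{
    barriers_done++;
}

static void model_map_single_nobarrier(const void *buf, size_t len)
{
    model_cache_clean(buf, len);
}

static void model_map_single(const void *buf, size_t len)
{
    model_map_single_nobarrier(buf, len);
    model_dma_barrier();
}
#else
/* No barrier needed: the existing mapping call already does everything. */
static void model_map_single(const void *buf, size_t len)
{
    model_cache_clean(buf, len);
}

/* The "no barrier" name simply forwards to the existing call. */
static void model_map_single_nobarrier(const void *buf, size_t len)
{
    model_map_single(buf, len);
}

/* The barrier function is a no-op on such an architecture. */
static void model_dma_barrier(void)
{
}
#endif
```

With the switch left at 0, a caller written against the batched interface behaves exactly as if it had called the existing mapping function for each buffer.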

Requesting alternative ideas or code design to get the desired 
nonbarrier versions of the mapping functions.

Thanks,
Abhijeet Dharmapurikar







Thread overview: 7+ messages
2009-12-07 19:37 non barrier versions of dma_map functions adharmap at codeaurora.org
2009-12-07 19:35 ` Russell King - ARM Linux
2009-12-10  0:32   ` Abhijeet Dharmapurikar
2009-12-10  9:39     ` Catalin Marinas
2009-12-10 18:16       ` Abhijeet Dharmapurikar
2009-12-10 19:08         ` Russell King - ARM Linux
  -- strict thread matches above, loose matches on Subject: below --
2009-12-12  0:12 Abhijeet Dharmapurikar
