linux-arm-kernel.lists.infradead.org archive mirror
 help / color / mirror / Atom feed
* DMA Engine API performance issues
       [not found] ` <p2q121039231004132344pae57964atbd6587f6ed38c471@mail.gmail.com>
@ 2010-04-14  6:46   ` melwyn lobo
  0 siblings, 0 replies; 4+ messages in thread
From: melwyn lobo @ 2010-04-14  6:46 UTC (permalink / raw)
  To: linux-arm-kernel

Subject edited to allow posting on ARM kernel list

On Wed, Apr 14, 2010 at 12:14 PM, melwyn lobo <linux.melwyn@gmail.com> wrote:
> Forwarding the query also to arm kernel mailiing list.
>
> Regards,
> M.
>
>
> ---------- Forwarded message ----------
> From: melwyn lobo <linux.melwyn@gmail.com>
> Date: Wed, Apr 14, 2010 at 12:12 PM
> Subject: DMA Engine API performance issues
> To: dan.j.williams at intel.com, linux-arm-kernel at lists.arm.linux.org.uk,
> linux-kernel at vger.kernel.org
>
>
> Hello Dan,
> I have some doubts regarding DMA API usage on its clients for example
> MMC, ALSA, USB etc.
>
> I am going to take example of the ALSA framework. Audio data transfer
> is initiated in soc_pcm_trigger(). This is called in an atomic
> context,
> with spinlock held and irqs disabled. Here most drivers enable data
> transfer from the MSP peripheral to the audio codec via tx_submit
> implementation
> of the DMA engine driver. This enqueues the transaction in an active
> list which calls for using spinlocks with bottom half disabled.
> It is in this function when spin_unlock_bh() is called the kernel
> detects irq's disabled and generates a warning.
> So the workaround here for ALSA drivers would be to use tasklet or
> workqueues to defer calling this in an interruptible context, which
> would
> cause performance problems (the same function soc_pcm_trigger is
> called for stoppping the transfer) in cases where the stream has to be
> repeatedly
> stopped and started.
>
> So the core issue is use of spin_unlock_bh in an atomic context.
> Workaround for removing the warnings would be:
> 1. Use spin_lock with irqsave and corresponding unlock function which
> does not generate a warning in a similar situation.
> ?But this could be futile in one case where the tasklet is scheduled
> from ksoftirqd which could lead to corruption.
> ?Also this means the interrupts would be disabled (on the local cpu)
> till the function executed which is not something
> ?desirable.
> 2. Use local_irq_enable() before calling the DMA APIs and disable once
> done. This is a crude solution and understandably undesirable and
> dangerous.
>
> The DMA Engine framework assumes that the channel interrupt handling
> is done in a tasklet (dma_run_dependencies), which I believe is the
> reason for the issue.
>
> Regards,
> M.
>

^ permalink raw reply	[flat|nested] 4+ messages in thread

* DMA Engine API performance issues
       [not found] <x2i121039231004132342sa230bfd4k832950fce534113f@mail.gmail.com>
       [not found] ` <p2q121039231004132344pae57964atbd6587f6ed38c471@mail.gmail.com>
@ 2010-04-15  0:08 ` Dan Williams
  2010-04-15  9:13   ` melwyn lobo
  1 sibling, 1 reply; 4+ messages in thread
From: Dan Williams @ 2010-04-15  0:08 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Apr 13, 2010 at 11:42 PM, melwyn lobo <linux.melwyn@gmail.com> wrote:
> Hello Dan,
> I have some doubts regarding DMA API usage on its clients for example
> MMC, ALSA, USB etc.
>
> I am going to take example of the ALSA framework. Audio data transfer
> is initiated in soc_pcm_trigger(). This is called in an atomic
> context,
> with spinlock held and irqs disabled. Here most drivers enable data
> transfer from the MSP peripheral to the audio codec via tx_submit
> implementation
> of the DMA engine driver. This enqueues the transaction in an active
> list which calls for using spinlocks with bottom half disabled.
> It is in this function when spin_unlock_bh() is called the kernel
> detects irq's disabled and generates a warning.
> So the workaround here for ALSA drivers would be to use tasklet or
> workqueues to defer calling this in an interruptible context, which
> would
> cause performance problems (the same function soc_pcm_trigger is
> called for stoppping the transfer) in cases where the stream has to be
> repeatedly
> stopped and started.
>
> So the core issue is use of spin_unlock_bh in an atomic context.
> Workaround for removing the warnings would be:
> 1. Use spin_lock with irqsave and corresponding unlock function which
> does not generate a warning in a similar situation.
> ?But this could be futile in one case where the tasklet is scheduled
> from ksoftirqd which could lead to corruption.
> ?Also this means the interrupts would be disabled (on the local cpu)
> till the function executed which is not something
> ?desirable.
> 2. Use local_irq_enable() before calling the DMA APIs and disable once
> done. This is a crude solution and understandably undesirable and
> dangerous.
>
> The DMA Engine framework assumes that the channel interrupt handling
> is done in a tasklet (dma_run_dependencies), which I believe is the
> reason for the issue.

dma_run_dependencies() is only needed in the channel switching case
which really only applies to the raid/mem-to-mem usage model (i.e. xor
on one channel followed by copy on another).  In the mem-to-io model
you should not need to perform channel switching.  I suggest following
what the other mem-to-io drivers (ipu, dw_dmac, coh...) have
implemented with their locks.

In general the dmaengine api is meant to provide 1/ a method for
matching dma consumers with capable dma devices 2/ a platform agnostic
api for issuing mem-to-mem and simple mem-to-io (slave) dma.  If the
current framework provides everything you need, then by all means use
it, but you may find there are architecture specific concerns that
cannot be supported under the existing mem-to-io model.  In other
words the dmaengine abstraction stops being useful and gets in the way
when there are specific architecture considerations beyond simple
channel to slave-device associations.  For example, dw_dmac and ipu
are successfully using the dma-slave interface while the PXA folks are
sticking with their local dma api.

So use it if it simplifies your development more than it complicates it.

--
Dan

^ permalink raw reply	[flat|nested] 4+ messages in thread

* DMA Engine API performance issues
  2010-04-15  0:08 ` Dan Williams
@ 2010-04-15  9:13   ` melwyn lobo
  2010-04-15 15:58     ` Dan Williams
  0 siblings, 1 reply; 4+ messages in thread
From: melwyn lobo @ 2010-04-15  9:13 UTC (permalink / raw)
  To: linux-arm-kernel

Thanks so much for your explanation.
OK..Looking at IPU implementation, irq handling is enrtirely in the
interrupt routine, therefore no need for spin_lock_bh stuff.
But for lower interrupt latency reasons we need to stick with the
tasklet handling for processing interrupts.

We were contemplating another solution for locking at our end for DMA
implementation
For locking
if(irqs_disabled())
    spin_lock(&chan->lock);
else
    spin_lock_bh(&chan->lock);

for unlocking:
if(irqs_disabled())
    spin_unlock(&chan->lock);
else
    spin_unlock_bh(&chan->lock);

Although this works  fine, but will this be OK when we submit the driver.

Regards,
M

On Thu, Apr 15, 2010 at 5:38 AM, Dan Williams <dan.j.williams@intel.com> wrote:
> On Tue, Apr 13, 2010 at 11:42 PM, melwyn lobo <linux.melwyn@gmail.com> wrote:
>> Hello Dan,
>> I have some doubts regarding DMA API usage on its clients for example
>> MMC, ALSA, USB etc.
>>
>> I am going to take example of the ALSA framework. Audio data transfer
>> is initiated in soc_pcm_trigger(). This is called in an atomic
>> context,
>> with spinlock held and irqs disabled. Here most drivers enable data
>> transfer from the MSP peripheral to the audio codec via tx_submit
>> implementation
>> of the DMA engine driver. This enqueues the transaction in an active
>> list which calls for using spinlocks with bottom half disabled.
>> It is in this function when spin_unlock_bh() is called the kernel
>> detects irq's disabled and generates a warning.
>> So the workaround here for ALSA drivers would be to use tasklet or
>> workqueues to defer calling this in an interruptible context, which
>> would
>> cause performance problems (the same function soc_pcm_trigger is
>> called for stoppping the transfer) in cases where the stream has to be
>> repeatedly
>> stopped and started.
>>
>> So the core issue is use of spin_unlock_bh in an atomic context.
>> Workaround for removing the warnings would be:
>> 1. Use spin_lock with irqsave and corresponding unlock function which
>> does not generate a warning in a similar situation.
>> ?But this could be futile in one case where the tasklet is scheduled
>> from ksoftirqd which could lead to corruption.
>> ?Also this means the interrupts would be disabled (on the local cpu)
>> till the function executed which is not something
>> ?desirable.
>> 2. Use local_irq_enable() before calling the DMA APIs and disable once
>> done. This is a crude solution and understandably undesirable and
>> dangerous.
>>
>> The DMA Engine framework assumes that the channel interrupt handling
>> is done in a tasklet (dma_run_dependencies), which I believe is the
>> reason for the issue.
>
> dma_run_dependencies() is only needed in the channel switching case
> which really only applies to the raid/mem-to-mem usage model (i.e. xor
> on one channel followed by copy on another). ?In the mem-to-io model
> you should not need to perform channel switching. ?I suggest following
> what the other mem-to-io drivers (ipu, dw_dmac, coh...) have
> implemented with their locks.
>
> In general the dmaengine api is meant to provide 1/ a method for
> matching dma consumers with capable dma devices 2/ a platform agnostic
> api for issuing mem-to-mem and simple mem-to-io (slave) dma. ?If the
> current framework provides everything you need, then by all means use
> it, but you may find there are architecture specific concerns that
> cannot be supported under the existing mem-to-io model. ?In other
> words the dmaengine abstraction stops being useful and gets in the way
> when there are specific architecture considerations beyond simple
> channel to slave-device associations. ?For example, dw_dmac and ipu
> are successfully using the dma-slave interface while the PXA folks are
> sticking with their local dma api.
>
> So use it if it simplifies your development more than it complicates it.
>
> --
> Dan
>

^ permalink raw reply	[flat|nested] 4+ messages in thread

* DMA Engine API performance issues
  2010-04-15  9:13   ` melwyn lobo
@ 2010-04-15 15:58     ` Dan Williams
  0 siblings, 0 replies; 4+ messages in thread
From: Dan Williams @ 2010-04-15 15:58 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, Apr 15, 2010 at 2:13 AM, melwyn lobo <linux.melwyn@gmail.com> wrote:
> Thanks so much for your explanation.
> OK..Looking at IPU implementation, irq handling is enrtirely in the
> interrupt routine, therefore no need for spin_lock_bh stuff.
> But for lower interrupt latency reasons we need to stick with the
> tasklet handling for processing interrupts.
>
> We were contemplating another solution for locking at our end for DMA
> implementation
> For locking
> if(irqs_disabled())
> ? ?spin_lock(&chan->lock);
> else
> ? ?spin_lock_bh(&chan->lock);
>
> for unlocking:
> if(irqs_disabled())
> ? ?spin_unlock(&chan->lock);
> else
> ? ?spin_unlock_bh(&chan->lock);
>
> Although this works ?fine, but will this be OK when we submit the driver.
>

This won't work and lockdep should complain about this situation.  The
minute you need to take the lock in the interrupt handler then all
other occurrences of the lock need to be promoted to _irq or
_irqsave() [1].

--
Dan

[1]: http://www.kernel.org/pub/linux/kernel/people/rusty/kernel-locking/c214.html

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2010-04-15 15:58 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <x2i121039231004132342sa230bfd4k832950fce534113f@mail.gmail.com>
     [not found] ` <p2q121039231004132344pae57964atbd6587f6ed38c471@mail.gmail.com>
2010-04-14  6:46   ` DMA Engine API performance issues melwyn lobo
2010-04-15  0:08 ` Dan Williams
2010-04-15  9:13   ` melwyn lobo
2010-04-15 15:58     ` Dan Williams

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).