Proposal for more reliable audio DMA.

* Proposal for more reliable audio DMA.
@ 2009-06-21  2:06 Jon Smirl
  2009-06-22 16:27 ` James Courtier-Dutton
  0 siblings, 1 reply; 18+ messages in thread
From: Jon Smirl @ 2009-06-21  2:06 UTC (permalink / raw)
  To: alsa-devel mailing list

This algorithm is fairly similar to what currently exists in ALSA, with
a few modifications. I'm trying to come up with a guaranteed way to play
glitch-free audio. Does this algorithm work, or could it be modified to
work?

The main change is to track a high resolution timer as the means of
estimating where in the buffer to insert new, low latency data. This
tracking is done in a timer tick interrupt. Jiffies are too coarse and
not all hardware allows you to ask the current DMA position.

The other change is to keep all of the buffers filled with silence if
there isn't any pending data. (Not doing this caused problems in my
mpc5200 AC97 driver.)

Three buffers are used as a way to deterministically bound the DMA
pointer without needing interrupts.

---

Use three chained buffers.
Buffer size is samples/tick (or maybe 1.5 samples/tick).
Initialize buffers to silence.
Set the end of buffer three to automatically terminate DMA.

Set the FIFO to the minimum the bus allows.
Fill buffer one with the samples available, start playing.

On timer tick, call into the driver.
If buffer one has finished playing, move it to third position.
The new last buffer is set to automatically stop DMA.
Return swap status to ALSA.
If swapped, fill buffer three with silence or pending data.
It is important to fill this buffer with silence if there is no pending data.

The size of the buffers needs to be large enough to cover worst case
timer tick latency. Buffer two has to be large enough to ensure that
the tick routine will notice buffer one has been played before buffer
three starts.

If two buffers have been completed when the tick runs, let the third
buffer finish without swapping. Since you don't know where you are in
the third buffer, it is unsafe to swap at this point.

Playing the last buffer causes the end of buffer interrupt to happen.
Callback into ALSA to alert it of the underrun and need to restart DMA.
If this callback happens, make the buffers larger.
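
The tick-handler steps above can be sketched in C as follows. This is
only an illustration under stated assumptions: struct chain, chain_init()
and chain_tick() are hypothetical names, not ALSA or driver APIs, and the
"completed" count is assumed to come from whatever completion status the
DMA hardware exposes.

```c
#include <stdbool.h>
#include <string.h>

#define NBUF 3
#define BUF_SAMPLES 8           /* illustrative; really about samples/tick */

struct chain {
    short data[NBUF][BUF_SAMPLES];
    int order[NBUF];            /* order[0] plays first; order[2] ends DMA */
};

/* All buffers start out as silence, chained 0 -> 1 -> 2. */
void chain_init(struct chain *c)
{
    memset(c->data, 0, sizeof(c->data));
    for (int i = 0; i < NBUF; i++)
        c->order[i] = i;
}

/*
 * Timer tick. 'completed' is the number of buffers the hardware finished
 * since the last tick. Returns true if the chain was rotated.
 */
bool chain_tick(struct chain *c, int completed,
                const short *pending, size_t npending)
{
    if (completed != 1)
        return false;   /* 0: nothing to do; 2: position unknown, unsafe */

    int done = c->order[0];
    c->order[0] = c->order[1];
    c->order[1] = c->order[2];
    c->order[2] = done;         /* recycled buffer becomes the new last one */

    /* Silence first, then whatever pending data fits. */
    memset(c->data[done], 0, sizeof(c->data[done]));
    if (npending > BUF_SAMPLES)
        npending = BUF_SAMPLES;
    if (npending > 0)
        memcpy(c->data[done], pending, npending * sizeof(short));
    return true;
}
```

When two buffers have completed, the sketch does nothing, so the last
buffer runs out and the end-of-buffer interrupt reports the underrun.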

Long-term observation of buffer completion status in the tick handler will
allow accurate computation of samples/HPET unit. Record the high accuracy
timer (HPET) value on each tick.

New low latency data that arrives can be inserted into the buffers dynamically.
Use the high resolution timer source to estimate where to place it.
Minimum FIFO ensures low latency play.
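
A rough sketch of the samples/HPET-unit idea, assuming the tick handler
can pair a timer count with a known number of completed samples. The
names rate_est, rate_observe() and rate_estimate() are made up for
illustration. The estimate extrapolates linearly from the last
observation using the long-term average rate.

```c
#include <stdint.h>

struct rate_est {
    uint64_t first_count;    /* timer count at the first observation */
    uint64_t first_samples;  /* samples completed at the first observation */
    uint64_t last_count;     /* timer count at the latest observation */
    uint64_t last_samples;   /* samples completed at the latest observation */
    int have_first;
};

/* Record one (timer count, samples completed) pair from the tick handler. */
void rate_observe(struct rate_est *r, uint64_t count, uint64_t samples_done)
{
    if (!r->have_first) {
        r->first_count = count;
        r->first_samples = samples_done;
        r->have_first = 1;
    }
    r->last_count = count;
    r->last_samples = samples_done;
}

/*
 * Estimate the sample being played at timer count 'now'. With fewer than
 * two distinct observations there is no rate yet, so the last known
 * sample count is returned. The multiplication can overflow for very
 * long intervals; a real implementation would need to guard that.
 */
uint64_t rate_estimate(const struct rate_est *r, uint64_t now)
{
    uint64_t dc = r->last_count - r->first_count;
    uint64_t ds = r->last_samples - r->first_samples;

    if (dc == 0)
        return r->last_samples;
    return r->last_samples + (now - r->last_count) * ds / dc;
}
```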

Design a minimum power mode.
Allows a huge FIFO to be loaded (like 128KB).
Call into driver to reset to minimum latency, small FIFO mode.

Advantages:

1) If ALSA goes away, hardware will stop on its own with no noise.
2) No need to know the current position of DMA hardware.
3) Both low and high latency modes.
4) Audio interrupts only generated as error condition
5) Behavior is deterministic. Nothing is left to guess work.

-- 
Jon Smirl
jonsmirl@gmail.com

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Proposal for more reliable audio DMA.
  2009-06-21  2:06 Jon Smirl
@ 2009-06-22 16:27 ` James Courtier-Dutton
  2009-06-22 16:43   ` Jon Smirl
  0 siblings, 1 reply; 18+ messages in thread
From: James Courtier-Dutton @ 2009-06-22 16:27 UTC (permalink / raw)
  To: Jon Smirl; +Cc: alsa-devel mailing list

2009/6/21 Jon Smirl <jonsmirl@gmail.com>:
> New last buffer is set to automatically stop DMA.
How do you do this?
DMA transfers on sound cards are a ring buffer. There is no automatic
stop feature.
You set the dma pointers, start dma going in a loop and that is it.
You can then stop the DMA on command but not as a result of a buffer end.

* Re: Proposal for more reliable audio DMA.
  2009-06-22 16:27 ` James Courtier-Dutton
@ 2009-06-22 16:43   ` Jon Smirl
  2009-06-23  9:54     ` Mark Brown
  0 siblings, 1 reply; 18+ messages in thread
From: Jon Smirl @ 2009-06-22 16:43 UTC (permalink / raw)
  To: James Courtier-Dutton; +Cc: alsa-devel mailing list

On Mon, Jun 22, 2009 at 12:27 PM, James
Courtier-Dutton<james.dutton@gmail.com> wrote:
> 2009/6/21 Jon Smirl <jonsmirl@gmail.com>:
>> New last buffer is set to automatically stop DMA.
> How do you do this?
> DMA transfers on sound cards are a ring buffer. There is no automatic
> stop feature.

I don't know about all hardware, but all of the hardware I've worked with
works both ways, ring or stop at the end.

DMA transfers for network packets wouldn't work in the ring buffer
model, you need the stop at the end capability.

> You set the dma pointers, start dma going in a loop and that is it.
> You can then stop the DMA on command but not as a result of a buffer end.
>



-- 
Jon Smirl
jonsmirl@gmail.com

* Re: Proposal for more reliable audio DMA.
  2009-06-22 16:43   ` Jon Smirl
@ 2009-06-23  9:54     ` Mark Brown
  2009-06-24 14:10       ` Jon Smirl
  0 siblings, 1 reply; 18+ messages in thread
From: Mark Brown @ 2009-06-23  9:54 UTC (permalink / raw)
  To: Jon Smirl; +Cc: alsa-devel mailing list, James Courtier-Dutton

On Mon, Jun 22, 2009 at 12:43:52PM -0400, Jon Smirl wrote:
> On Mon, Jun 22, 2009 at 12:27 PM, James

> > DMA transfers on sound cards are a ring buffer. There is no automatic
> > stop feature.

> I don't about all hardware, but all of the hardware I've worked with
> works both ways, ring or stop at the end.

> DMA transfers for network packets wouldn't work in the ring buffer
> model, you need the stop at the end capability.

Remember, you're working with a general purpose SoC which shares the DMA
controller with a large selection of other hardware.  A DMA controller
that's part of a sound device and can't be used in anything else doesn't
need to worry about any other applications.

* Re: Proposal for more reliable audio DMA.
  2009-06-23  9:54     ` Mark Brown
@ 2009-06-24 14:10       ` Jon Smirl
  2009-06-24 14:39         ` Takashi Iwai
  0 siblings, 1 reply; 18+ messages in thread
From: Jon Smirl @ 2009-06-24 14:10 UTC (permalink / raw)
  To: Mark Brown; +Cc: alsa-devel mailing list, James Courtier-Dutton

On Tue, Jun 23, 2009 at 5:54 AM, Mark
Brown<broonie@opensource.wolfsonmicro.com> wrote:
> On Mon, Jun 22, 2009 at 12:43:52PM -0400, Jon Smirl wrote:
>> On Mon, Jun 22, 2009 at 12:27 PM, James
>
>> > DMA transfers on sound cards are a ring buffer. There is no automatic
>> > stop feature.
>
>> I don't about all hardware, but all of the hardware I've worked with
>> works both ways, ring or stop at the end.
>
>> DMA transfers for network packets wouldn't work in the ring buffer
>> model, you need the stop at the end capability.
>
> Remember, you're working with a general purpose SoC which shares the DMA
> controller with a large selection of other hardware.  A DMA controller
> that's part of a sound device and can't be used in anything else doesn't
> need to worry about any other applications.
>

From what I have observed the current ALSA DMA design does not
reliably deal with over/underrun.  On the hardware I'm using it is
possible to construct a system which will always behave predictably
but I can't build it using the ALSA driver interface.

These issues probably indicate that the DMA interface between ALSA
and the driver has been designed at the wrong level. For example those
timers trying to fix glitches in HDA belong down in the HDA driver,
not the core. Why did my DMA code need to peek back into ALSA core at
appl pointer?  The proliferation of flags on the DMA interface is also
an indication that it is too low level.

I'm still working on solutions for my embedded application but I may
be forced to add private IOCTLs to the driver and bypass ALSA.  That
will work for me since I'm not building a general purpose system.



-- 
Jon Smirl
jonsmirl@gmail.com

* Re: Proposal for more reliable audio DMA.
  2009-06-24 14:10       ` Jon Smirl
@ 2009-06-24 14:39         ` Takashi Iwai
  2009-06-24 15:14           ` Jon Smirl
  0 siblings, 1 reply; 18+ messages in thread
From: Takashi Iwai @ 2009-06-24 14:39 UTC (permalink / raw)
  To: Jon Smirl; +Cc: alsa-devel mailing list, Mark Brown, James Courtier-Dutton

At Wed, 24 Jun 2009 10:10:35 -0400,
Jon Smirl wrote:
> 
> On Tue, Jun 23, 2009 at 5:54 AM, Mark
> Brown<broonie@opensource.wolfsonmicro.com> wrote:
> > On Mon, Jun 22, 2009 at 12:43:52PM -0400, Jon Smirl wrote:
> >> On Mon, Jun 22, 2009 at 12:27 PM, James
> >
> >> > DMA transfers on sound cards are a ring buffer. There is no automatic
> >> > stop feature.
> >
> >> I don't about all hardware, but all of the hardware I've worked with
> >> works both ways, ring or stop at the end.
> >
> >> DMA transfers for network packets wouldn't work in the ring buffer
> >> model, you need the stop at the end capability.
> >
> > Remember, you're working with a general purpose SoC which shares the DMA
> > controller with a large selection of other hardware.  A DMA controller
> > that's part of a sound device and can't be used in anything else doesn't
> > need to worry about any other applications.
> >
> 
> From what I have observed the current ALSA DMA design does not
> reliably deal with over/underrun.  On the hardware I'm using it is
> possible to construct a system which will always behave predictably
> but I can't build it using the ALSA driver interface.

That's true for your hardware.  But not for most hardware with
simple "setup-go-and-dont-touch-anymore" style DMA.

> These issues probably indicates that the DMA interface between ALSA
> and the driver has been designed at the wrong level.

Partly true.  ALSA PCM was designed for the typical ISA/PCI DMA transfer
model, not for embedded devices.  (BTW, it means that your proposal
can't be applied easily to most of these devices because their DMA
setup cannot be changed at all while DMA is running...)

> For example those
> timers trying to fix glitches in HDA belong down in the HDA driver,
> not the core.

Basically, XRUN can be avoided very easily.  Simply have a large
enough buffer.  The remaining question is how to fill the buffer.  If you
don't want to take h/w interrupts from the sound chip, you need some
other timing source to sync with the position.  But, XRUN is simply a
matter of the buffer size and the latency of the system.
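
As a small worked example of that relationship, the buffer must hold at
least rate * worst-case latency frames. The helper below is a
hypothetical sketch (min_buffer_frames() is not an ALSA function), and the
worst-case latency figure is assumed to have been measured elsewhere.

```c
#include <stdint.h>

/*
 * Smallest number of frames that covers 'worst_latency_us' microseconds
 * of playback at 'rate_hz', rounded up. A real buffer would add a margin.
 */
uint32_t min_buffer_frames(uint32_t rate_hz, uint32_t worst_latency_us)
{
    uint64_t frames =
        ((uint64_t)rate_hz * worst_latency_us + 999999u) / 1000000u;
    return (uint32_t)frames;
}
```

For instance, 48000 Hz with a 10 ms worst-case wakeup latency needs at
least 480 frames queued ahead of the hardware.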

The glitch-free problem of PA comes from the fact that PA assumes that
the driver returns the current hw position accurately at any time.
But, in much hardware, including HDA, this is not true.  The hardware
lies.  It doesn't report the right position at all.  Thus, there are
many workarounds implemented on the HD-audio side.

So, before a discussion goes chaotic, I'd like to separate two issues:
- how to avoid XRUN
- how to detect XRUN

The former is what I mentioned above.

The latter pretty much depends on the hardware, and your proposal would
help (if it were possible for the target hardware).

Actually, for me, your proposal looks rather like a redesign of PCM
core to fit better with specific embedded devices.  That's fine, and
I've been thinking of a way to improve the core model.  But, it would
merely help for "reliability" in general, if you look at all devices
we must support.

thanks,

Takashi

* Re: Proposal for more reliable audio DMA.
  2009-06-24 14:39         ` Takashi Iwai
@ 2009-06-24 15:14           ` Jon Smirl
  2009-06-24 15:24             ` Takashi Iwai
  2009-06-25 15:26             ` James Courtier-Dutton
  0 siblings, 2 replies; 18+ messages in thread
From: Jon Smirl @ 2009-06-24 15:14 UTC (permalink / raw)
  To: Takashi Iwai; +Cc: alsa-devel mailing list, Mark Brown, James Courtier-Dutton

On Wed, Jun 24, 2009 at 10:39 AM, Takashi Iwai<tiwai@suse.de> wrote:
> The glitch-free problem of PA comes from the fact that PA assumes that
> the driver returns the current hw position accurately at any time.
> But, in many hardwares, including HDA, this is not true.  The hardware
> lies.  It doesn't report the right position at all.  Thus, there are
> many workarounds implemented in HD-audio side.

Why does pulse need to know the DMA position?

If it is so that it can write into the buffer with minimal latency
there are other ways to accomplish that. The simplest is to just add
an entry into ALSA core that says, play this buffer with minimal
latency. That would let the transfer be pushed down into the specific
driver and that driver could handle it in an optimal way.

Optimal on my hardware would be to reprogram the DMA hardware to
immediately start playing from the new buffer. No copies involved. The
FIFO will hide the buffer swap from the audio hardware.

Optimal on ring buffer hardware would be to locate where DMA was in
the ring and copy to a position in front of it.
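
For the ring buffer case, a minimal sketch of "copy to a position in
front of it" might look like this. It assumes the DMA position hw_pos is
known (which later replies point out is not always reliable) and that
'lead' is a caller-chosen safety margin in samples; ring_write_ahead()
is a hypothetical name.

```c
#include <stddef.h>
#include <string.h>

/*
 * Copy 'n' samples into a ring of 'ring_len' samples, starting 'lead'
 * samples ahead of the DMA position and wrapping at the end of the ring.
 * Returns the index where the first sample was written. The caller is
 * assumed to keep lead + n <= ring_len so the copy does not lap the DMA.
 */
size_t ring_write_ahead(short *ring, size_t ring_len, size_t hw_pos,
                        size_t lead, const short *src, size_t n)
{
    size_t start = (hw_pos + lead) % ring_len;
    size_t first;

    if (n > ring_len)
        n = ring_len;
    first = ring_len - start;       /* room before the wrap point */
    if (first > n)
        first = n;
    memcpy(ring + start, src, first * sizeof(*ring));
    memcpy(ring, src + first, (n - first) * sizeof(*ring));
    return start;
}
```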

These hardware differences should be hidden from pulse.

-- 
Jon Smirl
jonsmirl@gmail.com

* Re: Proposal for more reliable audio DMA.
  2009-06-24 15:14           ` Jon Smirl
@ 2009-06-24 15:24             ` Takashi Iwai
  2009-06-24 16:07               ` Jon Smirl
  2009-06-25 15:26             ` James Courtier-Dutton
  1 sibling, 1 reply; 18+ messages in thread
From: Takashi Iwai @ 2009-06-24 15:24 UTC (permalink / raw)
  To: Jon Smirl; +Cc: alsa-devel mailing list, Mark Brown, James Courtier-Dutton

At Wed, 24 Jun 2009 11:14:33 -0400,
Jon Smirl wrote:
> 
> On Wed, Jun 24, 2009 at 10:39 AM, Takashi Iwai<tiwai@suse.de> wrote:
> > The glitch-free problem of PA comes from the fact that PA assumes that
> > the driver returns the current hw position accurately at any time.
> > But, in many hardwares, including HDA, this is not true.  The hardware
> > lies.  It doesn't report the right position at all.  Thus, there are
> > many workarounds implemented in HD-audio side.
> 
> Why does pulse need to know the DMA position?
> 
> If it is so that it can write into the buffer with minimal latency
> there are other ways to accomplish that. The simplest is to just add
> an entry into ALSA core that says, play this buffer with minimal
> latency. That would let the transfer be pushed down into the specific
> driver and that driver could handle it in an optimal way.

To get a minimum latency, you need to program the sound hardware
to notify that.  And, most hardware can't do anything but issue IRQs
periodically at buffer, fragment or period or whatever boundary.

Instead of the hardware irqs (that can wake up too often), PA uses the
timer.  Then it must check the position because the clock on the
hardware isn't quite accurate and very likely you'll get a drift
sooner or later if you use the other timer source.

Yes, there can be optimizations for hardware that is capable of
notifying on free chunks of any size.  This is a missing piece.

> Optimal on my hardware would be to reprogram the DMA hardware to
> immediately start playing from the new buffer. No copies involved. The
> FIFO will hide the buffer swap from the audio hardware.

Yes.  But it's rather a rare case that you can do such an operation.

> Optimal on ring buffer hardware would be to locate where DMA was in
> the ring and copy to a position in front of it.

That's exactly what PA does.  But you must know where to copy
beforehand.  And the hardware lies about where the current position is.

Takashi

* Re: Proposal for more reliable audio DMA.
  2009-06-24 15:24             ` Takashi Iwai
@ 2009-06-24 16:07               ` Jon Smirl
  0 siblings, 0 replies; 18+ messages in thread
From: Jon Smirl @ 2009-06-24 16:07 UTC (permalink / raw)
  To: Takashi Iwai; +Cc: alsa-devel mailing list, Mark Brown, James Courtier-Dutton

On Wed, Jun 24, 2009 at 11:24 AM, Takashi Iwai<tiwai@suse.de> wrote:
> At Wed, 24 Jun 2009 11:14:33 -0400,
> Jon Smirl wrote:
>>
>> On Wed, Jun 24, 2009 at 10:39 AM, Takashi Iwai<tiwai@suse.de> wrote:
>> > The glitch-free problem of PA comes from the fact that PA assumes that
>> > the driver returns the current hw position accurately at any time.
>> > But, in many hardwares, including HDA, this is not true.  The hardware
>> > lies.  It doesn't report the right position at all.  Thus, there are
>> > many workarounds implemented in HD-audio side.
>>
>> Why does pulse need to know the DMA position?
>>
>> If it is so that it can write into the buffer with minimal latency
>> there are other ways to accomplish that. The simplest is to just add
>> an entry into ALSA core that says, play this buffer with minimal
>> latency. That would let the transfer be pushed down into the specific
>> driver and that driver could handle it in an optimal way.
>
> To get a minimum latency, you need to program the sound hardware
> to notify that.  And, most hardware can't do anything but issue IRQs
> periodically at buffer, fragment or period or whatever boundary.
>
> Instead of the hardware irqs (that can wake up too often), PA uses the
> timer.  Then it must check the position because the clock on the
> hardware isn't quite accurate and very likely you'll get a drift
> sooner or later if you use the other timer source.

Pulse should not need to mess with this.  Doesn't pulse need to do
just two things?
1) play this buffer as soon as possible - abandon previously queued samples
2) ALSA call back saying I have room for more data in my queue
  2a) ALSA call back saying underrun happened

When pulse gets new low latency data from a client app, it should make
a new buffer, call into ALSA core and say, play this buffer ASAP. This
new buffer would replace the old one. Pulse does not need to know
where the DMA pointer is or mess with timers. Messing with those
should be internal to ALSA and its drivers.  The ring buffer hardware
model should not be exposed to pulse, especially since not all
hardware has ring buffers.

Inside ALSA this play-ASAP buffer should just be passed into the
driver. Only the driver really knows how to play something ASAP. The
implementation of play-ASAP will be quite different on ring buffer
hardware, scatter/gather DMA hardware or hardware that needs indirect
access to the buffer.
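
To make the play-ASAP idea concrete, here is a hedged sketch of a
per-driver callback with two toy back ends. Nothing here is an existing
ALSA interface: asap_ops, play_asap, ring_hw, sg_hw and core_play_asap()
are all invented for illustration, and the returned value is assumed to
be the delay, in samples, before the new data is heard.

```c
#include <stddef.h>

struct pcm_buf {
    const short *samples;
    size_t count;
};

/* Hypothetical per-driver hook: each driver plays 'buf' as soon as it can. */
struct asap_ops {
    size_t (*play_asap)(void *hw, const struct pcm_buf *buf);
};

/* Toy ring-buffer device: copy just ahead of the DMA position. */
struct ring_hw {
    short ring[16];
    size_t hw_pos;
    size_t lead;        /* safety margin, assumed smaller than the ring */
};

static size_t ring_play_asap(void *hw, const struct pcm_buf *buf)
{
    struct ring_hw *r = hw;
    size_t len = sizeof(r->ring) / sizeof(r->ring[0]);

    for (size_t i = 0; i < buf->count && i < len - r->lead; i++)
        r->ring[(r->hw_pos + r->lead + i) % len] = buf->samples[i];
    return r->lead;
}

/* Toy scatter/gather device: repoint DMA at the new buffer, no copy. */
struct sg_hw {
    const short *next;
    size_t next_count;
    size_t fifo_depth;  /* samples already in the FIFO still play first */
};

static size_t sg_play_asap(void *hw, const struct pcm_buf *buf)
{
    struct sg_hw *s = hw;

    s->next = buf->samples;
    s->next_count = buf->count;
    return s->fifo_depth;
}

const struct asap_ops ring_ops = { ring_play_asap };
const struct asap_ops sg_ops = { sg_play_asap };

/* What the core would do: hand the buffer straight to the driver. */
size_t core_play_asap(const struct asap_ops *ops, void *hw,
                      const struct pcm_buf *buf)
{
    return ops->play_asap(hw, buf);
}
```

The caller sees one entry point, while the copy-versus-repoint decision
stays inside each back end.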

Trying to directly expose the ring buffer to the app seems like a good
way to avoid a copy, but it isn't achieving that. Pulse is not
decoding audio straight into the ring buffer; it decodes first and
then copies into the buffer. Pulse is using the timers to estimate the
destination for the copy.  Move this copy down into the drivers, since
the drivers know the correct destination for the copy.

>
> Yes, there can be optimizations for hardwares that are capable to
> notify any free size chunks.  This is a missing piece.
>
>> Optimal on my hardware would be to reprogram the DMA hardware to
>> immediately start playing from the new buffer. No copies involved. The
>> FIFO will hide the buffer swap from the audio hardware.
>
> Yes.  But it's rather a rare case that you can do such an operation.
>
>> Optimal on ring buffer hardware would be to locate where DMA was in
>> the ring and copy to a position in front of it.
>
> That's exactly what PA does.  But you must know where to copy
> beforehand.  And the hardware lies where is the current position.
>
>
> Takashi
>

-- 
Jon Smirl
jonsmirl@gmail.com

* Re: Proposal for more reliable audio DMA.
@ 2009-06-24 16:28 Mark Brown
  2009-06-24 19:07 ` Jon Smirl
  0 siblings, 1 reply; 18+ messages in thread
From: Mark Brown @ 2009-06-24 16:28 UTC (permalink / raw)
  To: Jon Smirl; +Cc: Takashi Iwai, alsa-devel mailing list, James Courtier-Dutton

On 24 Jun 2009, at 17:07, Jon Smirl <jonsmirl@gmail.com> wrote:

> Trying to directly expose the ring buffer to the app seems like a good
>
> way to avoid a copy, but it isn't achieving that. pulse is not
> decoding audio straight into the ring buffer, it decodes first and
> then copies into the buffer. Pulse is using the timers to estimate the
> destination for the copy.  Move this copy down into the drivers, the
> drivers know the correct destination for the copy.

Pulse isn't just doing straight playback, a large part of what it's  
there for is to do software mixing. When you have multiple sources  
active pulse is going to be forced to do the copy as part of the  
mixing process so putting that bit of the buffer management in kernel  
won't help in the way you think it does. Part of what's going on here  
is that the kernel code is trying to give userspace access to the data  
for as long as possible.
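
The copy that mixing forces can be illustrated with a minimal sketch of
summing one source into an output buffer of signed 16-bit samples with
saturation. This is not PulseAudio's actual mixer; mix_s16() and
clamp16() are hypothetical helpers.

```c
#include <stddef.h>
#include <stdint.h>

/* Saturate a 32-bit sum to the signed 16-bit sample range. */
static int16_t clamp16(int32_t v)
{
    if (v > INT16_MAX)
        return INT16_MAX;
    if (v < INT16_MIN)
        return INT16_MIN;
    return (int16_t)v;
}

/* Mix 'n' samples of 'src' into 'dst' in place. */
void mix_s16(int16_t *dst, const int16_t *src, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = clamp16((int32_t)dst[i] + (int32_t)src[i]);
}
```

Every output sample is touched whenever a source is added, which is why
the mixer writes the buffer itself rather than handing over source
buffers untouched.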

* Re: Proposal for more reliable audio DMA.
  2009-06-24 16:28 Proposal for more reliable audio DMA Mark Brown
@ 2009-06-24 19:07 ` Jon Smirl
  2009-06-24 21:11   ` Takashi Iwai
  2009-06-25 10:25   ` Mark Brown
  0 siblings, 2 replies; 18+ messages in thread
From: Jon Smirl @ 2009-06-24 19:07 UTC (permalink / raw)
  To: Mark Brown; +Cc: Takashi Iwai, alsa-devel mailing list, James Courtier-Dutton

On Wed, Jun 24, 2009 at 12:28 PM, Mark
Brown<broonie@opensource.wolfsonmicro.com> wrote:
> On 24 Jun 2009, at 17:07, Jon Smirl <jonsmirl@gmail.com> wrote:
>
>> Trying to directly expose the ring buffer to the app seems like a good
>>
>> way to avoid a copy, but it isn't achieving that. pulse is not
>> decoding audio straight into the ring buffer, it decodes first and
>> then copies into the buffer. Pulse is using the timers to estimate the
>> destination for the copy.  Move this copy down into the drivers, the
>> drivers know the correct destination for the copy.
>
> Pulse isn't just doing straight playback, a large part of what it's there
> for is to do software mixing. When you have multiple sources active pulse is
> going to be forced to do the copy as part of the mixing process so putting
> that bit of the buffer management in kernel won't help in the way you think
> it does. Part of what's going on here is that the kernel code is trying to
> give userspace access to the data for as long as possible.
>

Does this work as a use case?

Pulse is playing music in the background.  A game wants to do a laser blast.

Pulse has already sent a buffer into kernel for the background music.
Pulse makes a new buffer that contains the background mixed with the laser.
It sends this new buffer into the kernel and says play with minimum latency.

The problem is knowing which sample in the background music to start
mixing the low latency laser blast into. ALSA will need to know this
index to figure out where to switch onto the replacement buffer. This
offset is dynamic and it depends on how much work pulse is doing.

So you need two things. An estimate of the current playing sample and
an estimate of the system latency to know where to start mixing.

Estimating the current sample can be done accurately by using a high
frequency, free running counter. Drift can be compensated for by
recording the count when the hardware accurately knows what sample it
is on. Knowing how many samples to delay before mixing is a function
of app latency and it needs to be measured in a feedback loop.
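
One possible shape for that feedback loop, sketched under assumptions:
the mix start index is the estimated playing sample plus a margin, and
the margin is adjusted after each write by comparing the chosen start
with where playback actually was. The names mix_margin, mix_start() and
mix_feedback(), and the specific grow/shrink rule, are illustrative
choices only.

```c
#include <stdint.h>

struct mix_margin {
    uint32_t samples;   /* how far ahead of the playing sample to mix */
};

/* Where to start mixing, given the estimated playing sample. */
uint64_t mix_start(uint64_t est_playing, const struct mix_margin *m)
{
    return est_playing + m->samples;
}

/*
 * Feedback after the mixed data was written. 'playing_at_write' is the
 * sample the hardware was on when the write finished.
 * Too late: grow the margin by the miss plus one sample.
 * Early: hand back one eighth of the unused slack.
 */
void mix_feedback(struct mix_margin *m, uint64_t start,
                  uint64_t playing_at_write)
{
    if (playing_at_write >= start) {
        m->samples += (uint32_t)(playing_at_write - start) + 1;
    } else {
        uint32_t cut = (uint32_t)((start - playing_at_write) / 8);

        if (cut > m->samples)
            cut = m->samples;
        m->samples -= cut;
    }
}
```

Growing quickly and shrinking slowly favours avoiding glitches over
reaching the lowest possible latency.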

I'm starting to think the OSS model is right and mixing belongs in the kernel.

-- 
Jon Smirl
jonsmirl@gmail.com

* Re: Proposal for more reliable audio DMA.
  2009-06-24 19:07 ` Jon Smirl
@ 2009-06-24 21:11   ` Takashi Iwai
  2009-06-25  4:26     ` Jon Smirl
  2009-06-25 10:25   ` Mark Brown
  1 sibling, 1 reply; 18+ messages in thread
From: Takashi Iwai @ 2009-06-24 21:11 UTC (permalink / raw)
  To: Jon Smirl; +Cc: alsa-devel mailing list, Mark Brown, James Courtier-Dutton

At Wed, 24 Jun 2009 15:07:17 -0400,
Jon Smirl wrote:
> 
> On Wed, Jun 24, 2009 at 12:28 PM, Mark
> Brown<broonie@opensource.wolfsonmicro.com> wrote:
> > On 24 Jun 2009, at 17:07, Jon Smirl <jonsmirl@gmail.com> wrote:
> >
> >> Trying to directly expose the ring buffer to the app seems like a good
> >>
> >> way to avoid a copy, but it isn't achieving that. pulse is not
> >> decoding audio straight into the ring buffer, it decodes first and
> >> then copies into the buffer. Pulse is using the timers to estimate the
> >> destination for the copy.  Move this copy down into the drivers, the
> >> drivers know the correct destination for the copy.
> >
> > Pulse isn't just doing straight playback, a large part of what it's there
> > for is to do software mixing. When you have multiple sources active pulse is
> > going to be forced to do the copy as part of the mixing process so putting
> > that bit of the buffer management in kernel won't help in the way you think
> > it does. Part of what's going on here is that the kernel code is trying to
> > give userspace access to the data for as long as possible.
> >
> 
> Does this work as a use case?
> 
> Pulse is playing music in the background.  A game want to do a laser blast.
> 
> Pulse has already sent a buffer into kernel for the background music.
> Pulse makes a new buffer that contains the background mixed with the laser.
> It sends this new buffer into the kernel and says play with minimum latency.

When it's mmapped and can be updated on the fly, no need to resend the
buffer.  You can just rewrite it.

> The problem is knowing which sample in the background music to start
> mixing the low latency laser blast into.

That's why querying the accurate hwptr is important in PA.


Takashi

* Re: Proposal for more reliable audio DMA.
  2009-06-24 21:11   ` Takashi Iwai
@ 2009-06-25  4:26     ` Jon Smirl
  2009-06-25  5:36       ` Robert Hancock
  0 siblings, 1 reply; 18+ messages in thread
From: Jon Smirl @ 2009-06-25  4:26 UTC (permalink / raw)
  To: Takashi Iwai; +Cc: alsa-devel mailing list, Mark Brown, James Courtier-Dutton

On Wed, Jun 24, 2009 at 5:11 PM, Takashi Iwai<tiwai@suse.de> wrote:
>> The problem is knowing which sample in the background music to start
>> mixing the low latency laser blast into.
>
> That's why querying the accurate hwptr is important in PA.

I'm still not convinced that all of this logic should be exposed to
PA. Exposing these details is what makes ALSA hard to use. We should
be able to better isolate user space from this. If mixing were moved
into the kernel these details could be hidden.  The in-kernel code
could then be customized for various sound DMA hardware. This would
also go a long way toward getting rid of latency issues by removing
the need for real-time response from PA.

My hardware doesn't have the capability of querying the hwptr and the
hwptr speed is not linear because of the FIFO and burst transfers.
Non-linear speed means I can't use a clock to estimate hwptr. I do
however have the capability of directing the DMA into a new buffer.
Another thing I could try is setting up DMA descriptor chain blocks
for every 16 bytes. These descriptors get marked as they are used and
they don't have to cause an interrupt.
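
The descriptor idea can be sketched as a scan for the first descriptor
not yet marked as used, which bounds the DMA position to within one
16-byte block. The layout below is hypothetical (struct dma_desc with a
single 'done' flag); a real driver would read hardware-defined
descriptors through volatile or DMA-coherent accesses.

```c
#include <stddef.h>
#include <stdint.h>

#define DESC_BYTES 16   /* one descriptor per 16 bytes of audio data */

/* Hypothetical descriptor: the hardware sets 'done' once it is consumed. */
struct dma_desc {
    uint32_t done;
};

/*
 * Lower bound on the bytes consumed so far: the number of leading
 * descriptors marked done, times the block size. The true position lies
 * somewhere inside the next block.
 */
size_t bytes_consumed(const struct dma_desc *d, size_t n)
{
    size_t i = 0;

    while (i < n && d[i].done)
        i++;
    return i * DESC_BYTES;
}
```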

We are evaluating a processor change from PPC to ARM so all of this
may change for me.

-- 
Jon Smirl
jonsmirl@gmail.com

* Re: Proposal for more reliable audio DMA.
  2009-06-25  4:26     ` Jon Smirl
@ 2009-06-25  5:36       ` Robert Hancock
  2009-06-25 14:20         ` Jon Smirl
  0 siblings, 1 reply; 18+ messages in thread
From: Robert Hancock @ 2009-06-25  5:36 UTC (permalink / raw)
  To: Jon Smirl
  Cc: Takashi Iwai, alsa-devel mailing list, Mark Brown,
	James Courtier-Dutton

On 06/24/2009 10:26 PM, Jon Smirl wrote:
> On Wed, Jun 24, 2009 at 5:11 PM, Takashi Iwai<tiwai@suse.de>  wrote:
>>> The problem is knowing which sample in the background music to start
>>> mixing the low latency laser blast into.
>> That's why querying the accurate hwptr is important in PA.
>
> I'm still not convinced that all of this logic should be exposed to
> PA. Exposing these details is what makes ALSA hard to use. We should
> be able to better isolate user space from this. If mixing were moved
> into the kernel these details could be hidden.  The in-kernel code
> could then be customized for various sound DMA hardware. This would
> also go a long ways toward getting rid of latency issues by removing
> the need for real-time response from PA.

Mixing really does not belong in the kernel. Moving it there doesn't 
remove any complication or problem, it just moves it to a different 
place where it's more difficult to program and less debuggable. Most 
OSes (Windows included) are moving in the direction of moving mixing out 
of the kernel, not into it.

For what PulseAudio is trying to do, it needs this kind of information 
because it wants to be able to rewrite the buffer the card is reading 
out of at any time, and it needs to be able to know how far along in the 
buffer the card has read so it knows where it can start rewriting. It's 
somewhat complicated for sure, but most normal applications don't have 
to deal with these kinds of details.

>
> My hardware doesn't have the capability of querying the hwptr and the
> hwptr speed is not linear because of the FIFO and burst transfers.
> Non-linear speed means I can't use a clock to estimate hwptr. I do
> however have the capability of directing the DMA into a new buffer.
> Another thing I could try is setting up DMA descriptor chain blocks
> for every 16 bytes. These descriptors get marked as they are used and
> they don't have to cause an interrupt.
>
> We are evaluating a processor change from PPC to ARM so all of this
> may change for me.
>

* Re: Proposal for more reliable audio DMA.
  2009-06-24 19:07 ` Jon Smirl
  2009-06-24 21:11   ` Takashi Iwai
@ 2009-06-25 10:25   ` Mark Brown
  1 sibling, 0 replies; 18+ messages in thread
From: Mark Brown @ 2009-06-25 10:25 UTC (permalink / raw)
  To: Jon Smirl; +Cc: Takashi Iwai, alsa-devel mailing list, James Courtier-Dutton

On Wed, Jun 24, 2009 at 03:07:17PM -0400, Jon Smirl wrote:
> On Wed, Jun 24, 2009 at 12:28 PM, Mark
> Brown<broonie@opensource.wolfsonmicro.com> wrote:

> > it does. Part of what's going on here is that the kernel code is trying to
> > give userspace access to the data for as long as possible.

> The problem is knowing which sample in the background music to start
> mixing the low latency laser blast into. ALSA will need to know this
> index to figure out where to switch onto the replacement buffer. This
> offset is dynamic and it depends on how much work pulse is doing.

Of course, some hardware is not going to allow the DMA controller to be
reprogrammed while active so would need to either wait for a buffer
boundary or update the data in the current buffer as is currently done.

> I'm starting to think the OSS model is right and mixing belongs in the kernel

This isn't a kernel/user problem.  Exactly the same issues come up if
the code pushing data into the driver is in the kernel, it'll still want
as much information as possible about what the current status is.

Moving any non-hardware stuff into the kernel would create more problems
than it solves.  Remember that ALSA supports arbitrary plugin stacks -
users could be doing signal processing on the data post mix, for example
doing soft EQ or 3D enhancement.  Some of this can be done pre-mix but
it'll always be less efficient and in some cases would interfere with
the operation of the algorithms.

Remember also that hardware output is just one option for ALSA.  You can
also have output plugins that do things like send data over the network.

* Re: Proposal for more reliable audio DMA.
  2009-06-25  5:36       ` Robert Hancock
@ 2009-06-25 14:20         ` Jon Smirl
  2009-06-25 23:26           ` Robert Hancock
  0 siblings, 1 reply; 18+ messages in thread
From: Jon Smirl @ 2009-06-25 14:20 UTC (permalink / raw)
  To: Robert Hancock
  Cc: Takashi Iwai, alsa-devel mailing list, Mark Brown,
	James Courtier-Dutton

On Thu, Jun 25, 2009 at 1:36 AM, Robert Hancock<hancockrwd@gmail.com> wrote:
> On 06/24/2009 10:26 PM, Jon Smirl wrote:
>>
>> On Wed, Jun 24, 2009 at 5:11 PM, Takashi Iwai<tiwai@suse.de>  wrote:
>>>>
>>>> The problem is knowing which sample in the background music to start
>>>> mixing the low latency laser blast into.
>>>
>>> That's why querying the accurate hwptr is important in PA.
>>
>> I'm still not convinced that all of this logic should be exposed to
>> PA. Exposing these details is what makes ALSA hard to use. We should
>> be able to better isolate user space from this. If mixing were moved
>> into the kernel these details could be hidden.  The in-kernel code
>> could then be customized for various sound DMA hardware. This would
>> also go a long ways toward getting rid of latency issues by removing
>> the need for real-time response from PA.
>
> Mixing really does not belong in the kernel. Moving it there doesn't remove
> any complication or problem, it just moves it to a different place where
> it's more difficult to program and less debuggable. Most OSes (Windows
> included) are moving in the direction of moving mixing out of the kernel,
> not into it.

Mixing has a real-time component to it. Currently Desktop Linux
doesn't have real-time support. That's why pulse is developing
RealTimeKit.  Buggy real-time code can easily lock your machine to
where you need to hit the reset button.

User space code that is locked down with real-time priority and
servicing interrupts is effectively kernel code, it might as well be
in the kernel where it can get rid of the process overhead.

http://git.0pointer.de/?p=rtkit.git;a=blob;f=README

>
> For what PulseAudio is trying to do, it needs this kind of information
> because it wants to be able to rewrite the buffer the card is reading out of
> at any time, and it needs to be able to know how far along in the buffer the
> card has read so it knows where it can start rewriting. It's somewhat
> complicated for sure, but most normal applications don't have to deal with
> these kinds of details.
>
>>
>> My hardware doesn't have the capability of querying the hwptr and the
>> hwptr speed is not linear because of the FIFO and burst transfers.
>> Non-linear speed means I can't use a clock to estimate hwptr. I do
>> however have the capability of directing the DMA into a new buffer.
>> Another thing I could try is setting up DMA descriptor chain blocks
>> for every 16 bytes. These descriptors get marked as they are used and
>> they don't have to cause an interrupt.
>>
>> We are evaluating a processor change from PPC to ARM so all of this
>> may change for me.
>>
>
>
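
The descriptor-chain idea quoted above could be sketched roughly as
follows. This is only an illustration under stated assumptions: the
`struct dma_desc` layout, the `done` flag and `poll_consumed_bytes()`
are hypothetical, and real DMA hardware defines its own descriptor
format.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DESC_BYTES 16 /* bytes covered by each descriptor, as in the text */

/* Hypothetical descriptor; real hardware defines its own layout. */
struct dma_desc {
    volatile bool done; /* set by the DMA engine once the block is used */
};

/* Walk the descriptor ring from *next, counting descriptors the hardware
 * has marked done and clearing their flags for reuse. Returns the number
 * of bytes consumed since the previous call and leaves *next at the
 * first descriptor that has not completed yet. No interrupt is needed;
 * this can be called from a timer tick. */
static size_t poll_consumed_bytes(struct dma_desc *ring, size_t n_desc,
                                  size_t *next)
{
    size_t completed = 0;

    assert(n_desc > 0 && *next < n_desc);
    while (completed < n_desc && ring[*next].done) {
        ring[*next].done = false;
        *next = (*next + 1) % n_desc;
        completed++;
    }
    return completed * DESC_BYTES;
}
```

The position obtained this way is only known to within one descriptor
(16 bytes here), and it marks what the DMA engine has fetched, which
may run ahead of what has actually been played out of the FIFO.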



-- 
Jon Smirl
jonsmirl@gmail.com


* Re: Proposal for more reliable audio DMA.
  2009-06-24 15:14           ` Jon Smirl
  2009-06-24 15:24             ` Takashi Iwai
@ 2009-06-25 15:26             ` James Courtier-Dutton
  1 sibling, 0 replies; 18+ messages in thread
From: James Courtier-Dutton @ 2009-06-25 15:26 UTC (permalink / raw)
  To: Jon Smirl; +Cc: Takashi Iwai, alsa-devel mailing list, Mark Brown

2009/6/24 Jon Smirl <jonsmirl@gmail.com>:
> If it is so that it can write into the buffer with minimal latency
> there are other ways to accomplish that. The simplest is to just add
> an entry into ALSA core that says, play this buffer with minimal
> latency. That would let the transfer be pushed down into the specific
> driver and that driver could handle it in an optimal way.
>
"Minimal latency" is not the only requirement.
Another is "play this sample at a predictable time", for example to
ensure it plays in sync with video.
The real problem is ensuring that the application reacts in time to
fill up the buffer before it empties. So the ideal would be a way to
ensure that "process X gets woken just before the buffer empties". One
way to trigger this is via the sound card hardware interrupt; another
is to use the global timer to wake the process at a particular time.
Using the timer is probably better because the wake-up time can then
be chosen more finely, but one then needs a way to sync the timer with
the hardware clock on the sound card.
I would like a kernel scheduling API that could reliably do "wake me
up at nanosecond X", but none exists. I can do "nanosleep(X)", which
will maybe wake me up in X nanoseconds, but it is not very reliable.
Another useful scheduling API would be "make sure process X is woken
up after me".
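
For reference, POSIX `clock_nanosleep()` with `TIMER_ABSTIME` does
accept an absolute deadline rather than a relative delay. It does not
by itself provide the reliability asked for above, since the actual
wake-up is still subject to scheduling latency. A minimal sketch,
where `timespec_add_ns()` and `sleep_until()` are helper names invented
for this example:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <time.h>

#define NSEC_PER_SEC 1000000000L

/* Return t + ns with tv_nsec normalised into [0, NSEC_PER_SEC). */
static struct timespec timespec_add_ns(struct timespec t, long ns)
{
    assert(ns >= 0);
    t.tv_sec += ns / NSEC_PER_SEC;
    t.tv_nsec += ns % NSEC_PER_SEC;
    if (t.tv_nsec >= NSEC_PER_SEC) {
        t.tv_sec += 1;
        t.tv_nsec -= NSEC_PER_SEC;
    }
    return t;
}

/* Sleep until an absolute CLOCK_MONOTONIC deadline, restarting the sleep
 * if a signal interrupts it. Returns 0 on success or an errno value. */
static int sleep_until(const struct timespec *deadline)
{
    int err;

    do {
        err = clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME,
                              deadline, NULL);
    } while (err == EINTR);
    return err;
}
```

A caller would read `CLOCK_MONOTONIC` once with `clock_gettime()`, add
the time until the buffer is expected to run low, and pass the result
to `sleep_until()`. Because the deadline is absolute, an interrupted
sleep can be restarted without accumulating drift. Keeping that clock
in step with the sound card's own sample clock remains a separate
problem.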


* Re: Proposal for more reliable audio DMA.
  2009-06-25 14:20         ` Jon Smirl
@ 2009-06-25 23:26           ` Robert Hancock
  0 siblings, 0 replies; 18+ messages in thread
From: Robert Hancock @ 2009-06-25 23:26 UTC (permalink / raw)
  To: Jon Smirl
  Cc: Takashi Iwai, alsa-devel mailing list, Mark Brown,
	James Courtier-Dutton

On 06/25/2009 08:20 AM, Jon Smirl wrote:
> On Thu, Jun 25, 2009 at 1:36 AM, Robert Hancock<hancockrwd@gmail.com>  wrote:
>> On 06/24/2009 10:26 PM, Jon Smirl wrote:
>>> On Wed, Jun 24, 2009 at 5:11 PM, Takashi Iwai<tiwai@suse.de>    wrote:
>>>>> The problem is knowing which sample in the background music to start
>>>>> mixing the low latency laser blast into.
>>>> That's why querying the accurate hwptr is important in PA.
>>> I'm still not convinced that all of this logic should be exposed to
>>> PA. Exposing these details is what makes ALSA hard to use. We should
>>> be able to better isolate user space from this. If mixing were moved
>>> into the kernel these details could be hidden.  The in-kernel code
>>> could then be customized for various sound DMA hardware. This would
>>> also go a long ways toward getting rid of latency issues by removing
>>> the need for real-time response from PA.
>> Mixing really does not belong in the kernel. Moving it there doesn't remove
>> any complication or problem, it just moves it to a different place where
>> it's more difficult to program and less debuggable. Most OSes (Windows
>> included) are moving in the direction of moving mixing out of the kernel,
>> not into it.
>
> Mixing has a real-time component to it. Currently Desktop Linux
> doesn't have real-time support. That's why pulse is developing
> RealTimeKit.  Buggy real-time code can easily lock your machine to
> where you need to hit the reset button.
>
> User space code that is locked down with real-time priority and
> servicing interrupts is effectively kernel code, it might as well be
> in the kernel where it can get rid of the process overhead.
>
> http://git.0pointer.de/?p=rtkit.git;a=blob;f=README

Just because it runs with real-time priority does not mean it is
effectively kernel code, or that it belongs there. Putting the code into
the kernel adds a bunch of extra challenges for little reason, and also
locks all users into a mixing scheme that may not meet their needs
(ahem, Windows kernel mixer...).


Thread overview: 18+ messages
2009-06-24 16:28 Proposal for more reliable audio DMA Mark Brown
2009-06-24 19:07 ` Jon Smirl
2009-06-24 21:11   ` Takashi Iwai
2009-06-25  4:26     ` Jon Smirl
2009-06-25  5:36       ` Robert Hancock
2009-06-25 14:20         ` Jon Smirl
2009-06-25 23:26           ` Robert Hancock
2009-06-25 10:25   ` Mark Brown
  -- strict thread matches above, loose matches on Subject: below --
2009-06-21  2:06 Jon Smirl
2009-06-22 16:27 ` James Courtier-Dutton
2009-06-22 16:43   ` Jon Smirl
2009-06-23  9:54     ` Mark Brown
2009-06-24 14:10       ` Jon Smirl
2009-06-24 14:39         ` Takashi Iwai
2009-06-24 15:14           ` Jon Smirl
2009-06-24 15:24             ` Takashi Iwai
2009-06-24 16:07               ` Jon Smirl
2009-06-25 15:26             ` James Courtier-Dutton
