* A suggestion to solve the ALSA resampling problem.
@ 2005-07-02 12:23 James Courtier-Dutton
2005-07-04 16:35 ` Takashi Iwai
0 siblings, 1 reply; 4+ messages in thread
From: James Courtier-Dutton @ 2005-07-02 12:23 UTC (permalink / raw)
To: alsa-devel
Hi,
There are essentially 2 problems to solve. I will take the alsa code in
the game doom3 as a way to demonstrate the problems.
doom3 uses a sample rate of 44100. Most sound cards now work at 48000 in
hardware, so one needs to sample rate convert from 44100 to 48000 in
software in order to get games like doom3 to play well.
doom3 also opens the buffers at 1024 frames per period, and 4096 frames
per buffer. It then writes 1024 frames at a time to the buffer. If the
complete write fails, it complains and the sound sounds really bad.
This approach is fine if the sound card can do periods of 1024 frames,
but not all sound card hardware can, so we need to find an abstraction
layer in order to deliver these period and buffer sizes to the
application, while also serving the different period and buffer sizes of
the hardware.
So, how about this.
The application can select any buffer size and period size they like.
For simplicity, we might impose that number_of_periods is an integer.
So, the 44100 application buffer is 4096(buffer size),1024(period size)
The hardware 48000 buffer is 2048(buffer size), 512(period size)
We add an intermediate buffer at 48000. This buffer size must be >=
4096*48000/44100. period size must be >= 1024*48000/44100.
If we round up, we get:
intermediate buffer: 4458.23 -> 4459
intermediate period: 1114.56 -> 1115
For ease of transfer between the intermediate buffer and the hardware,
we should really make the intermediate buffer_size = n*hardware period_size.
So, we have the following equations to satisfy:
intermediate_buffer_size = n * hardware_period_size.
intermediate_buffer_size >=
application_buffer*hardware_rate/application_rate.
intermediate_period_size = hardware_period_size.
This results in:
intermediate_buffer_size = 4608 (9 * 512)
intermediate_period_size = 512
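The sizing rule above can be sketched as a few lines of integer
arithmetic. This is only an illustration in Python; the function name
intermediate_sizes is an assumption made for the example and is not
part of alsa-lib.

```python
def intermediate_sizes(app_buffer, app_rate, hw_rate, hw_period):
    """Return (buffer_size, period_size, n) for the intermediate buffer.

    Hypothetical helper for illustration only, not an alsa-lib function.
    """
    # Smallest frame count at the hardware rate that holds the whole
    # application buffer: ceil(app_buffer * hw_rate / app_rate).
    # -(-a // b) is integer ceiling division, so no floats are involved.
    min_frames = -(-app_buffer * hw_rate // app_rate)
    # Round up again to a whole number of hardware periods.
    n = -(-min_frames // hw_period)
    return n * hw_period, hw_period, n


# The doom3 example: 4096 frames at 44100 Hz, hardware period 512 at 48000 Hz.
print(intermediate_sizes(4096, 44100, 48000, 512))  # (4608, 512, 9)
```

Rounding up twice (first to whole frames, then to whole hardware
periods) is what takes 4458.23 to 4459 and then to 9 * 512 = 4608.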
Each time the sound card hardware interrupts (every 512 frames), alsa
copies the next period from the intermediate buffer to the hardware
buffer. This essentially allows any sound card hardware to have as
many periods as it wishes, and thus as large a buffer size as it likes,
at the expense of one extra memcpy.
Each time the application writes to the application buffer, alsa-lib
sample rate converts the number of frames in the snd_pcm_write()
operation, and also writes those to the intermediate buffer.
Add to that some careful handling of position and avail pointers in each
of the buffers. This should result in reliable sample rate/buffer size
converted audio.
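As a rough sketch of the pointer handling mentioned above, a playback
ring buffer can be tracked with two free-running frame counters. The
class PlaybackRing and its method names are hypothetical and only
loosely modelled on the appl_ptr/hw_ptr idea; real alsa-lib bookkeeping
also handles wrap-around at the boundary, which is left out here.

```python
class PlaybackRing:
    """Toy playback ring buffer tracked by two free-running frame counters.

    Hypothetical illustration; not an alsa-lib data structure.
    """

    def __init__(self, buffer_size):
        self.buffer_size = buffer_size
        self.appl_ptr = 0  # total frames written by the producer
        self.hw_ptr = 0    # total frames consumed by the consumer

    def avail(self):
        # Frames of free space the producer may still write.
        return self.buffer_size - (self.appl_ptr - self.hw_ptr)

    def write(self, frames):
        # Accept only what fits; return the number of frames accepted.
        accepted = min(frames, self.avail())
        self.appl_ptr += accepted
        return accepted

    def consume(self, frames):
        # Drain up to `frames`; a short return would indicate an underrun.
        taken = min(frames, self.appl_ptr - self.hw_ptr)
        self.hw_ptr += taken
        return taken


ring = PlaybackRing(4608)   # intermediate buffer from the example
ring.write(1115)            # one converted application period
ring.consume(512)           # one hardware period interrupt
print(ring.avail())         # 4005
```

In the proposed scheme there would be one such pair of pointers per
buffer: the application buffer, the intermediate buffer and the
hardware buffer.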
The next thing to consider is whether some applications can get sound to
the pcm buffer without using snd_pcm_write(). For example memcpy for
mmap applications. We would need some way to track these memcpy writes,
so that the sample rate converter could take the new samples, and sample
rate convert them into the intermediate buffer.
Does anyone have any ideas regarding this last point? If this last point
never happens in alsa, then I will go ahead and start writing a patch
for alsa to do all this.
James
-------------------------------------------------------
SF.Net email is sponsored by: Discover Easy Linux Migration Strategies
from IBM. Find simple to follow Roadmaps, straightforward articles,
informative Webcasts and more! Get everything you need to get up to
speed, fast. http://ads.osdn.com/?ad_id=7477&alloc_id=16492&op=click
* Re: A suggestion to solve the ALSA resampling problem.
2005-07-02 12:23 A suggestion to solve the ALSA resampling problem James Courtier-Dutton
@ 2005-07-04 16:35 ` Takashi Iwai
2005-07-05 14:46 ` James Courtier-Dutton
0 siblings, 1 reply; 4+ messages in thread
From: Takashi Iwai @ 2005-07-04 16:35 UTC (permalink / raw)
To: James Courtier-Dutton; +Cc: alsa-devel
At Sat, 02 Jul 2005 13:23:42 +0100,
James Courtier-Dutton wrote:
>
> Hi,
>
> There are essentually 2 problems to solve. I will take the alsa code in
> the game doom3 as a way to demonstrate the problems.
>
> doom3 uses a sample rate of 44100. Most sound cards now work at 48000 in
> hardware, so one needs to sample rate convert between 44100 to 48000 in
> software in order to get games like doom3 to play well.
> doom3 also opens the buffers at 1024 frames per period, and 4096 frames
> per buffer. It then writes 1024 frames at a time to the buffer. If the
> complete write fails, it complains and the sound sounds really bad.
> This approach is fine in the sound card can do periods of 1024 frames,
> but not all sound card hardware can, so we need to find an abstraction
> layer in order to deliver these period and buffer sizes to the
> application, while also serving the different period and buffer sizes of
> the hardware.
>
> So, how about this.
> The application can select any buffer size and period size they like.
> For simplicity, we might impose that number_of_periods is an integer.
>
> So, the 44100 application buffer is 4096(buffer size),1024(period size)
> The hardware 48000 buffer is 2048(buffer size), 512(period size)
>
> We add an intermediate buffer at 48000. This buffer size must be >=
> 4096*48000/44100. period size must be >= 1024*48000/44100.
> If we round up, we get:
> intermediate buffer: 4458.23 -> 4459
> intermediate period: 1114.56 -> 1115
>
> For ease of transfer between the intermediate buffer and the hardware,
> we should really make the intermediate buffer_size = n*hardware period_size.
> So, we have the following equations to satify:
> intermediate_buffer_size = n * hardware_period_size.
> intermediate_buffer_size >=
> application_buffer*hardware_rate/application_rate.
> intermediate_period_size = hardware_period_size.
>
> This results in the:
> intermediate_buffer_size = 4608 (9 * 512)
> intermediate_period_size = 512
The sub-period is a good idea. However, this seems possible only when
you can change the period size of the slave PCM arbitrarily. A
typical bad example is dmix, which has a fixed period/buffer
configuration.
Also, rounding the fraction up or down always introduces a rounding
error. This may cause a drift in the long run.
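To put a number on the drift concern, here is a small sketch under one
assumed reading of it: every 1024-frame application period is mapped to
a fixed, rounded-up period of 1115 frames at 48000 Hz. The variable
names are illustrative only.

```python
from fractions import Fraction

app_rate, hw_rate = 44100, 48000
app_period = 1024

# Exact length of one application period in hardware-rate frames.
exact = Fraction(app_period * hw_rate, app_rate)    # 163840/147, about 1114.56
rounded = 1115                                      # assumed fixed period size

# Frames gained per period by rounding up, and per second of audio.
error_per_period = rounded - exact                  # 65/147 of a frame
periods_per_second = Fraction(app_rate, app_period)
drift_per_second = error_per_period * periods_per_second

print(error_per_period)          # 65/147
print(float(drift_per_second))   # about 19.04 frames per second
```

Under that assumption the two sides would disagree by roughly 19 frames
for every second of audio, so a fixed rounded period cannot be used
without some form of correction.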
> Each time the sound card hardware interrupts (every 512 frames), alsa
> copies the next period from the intermediate buffer to the hardware
> buffer. This essentually allows for any sound card hardware to have as
> many periods as they wish, and thus as larger buffer size as they like.
> With the expense of one extra memcpy.
>
> Each time the application writes to the application buffer, alsa-lib
> sample rate converts the number of frames in the snd_pcm_write()
> operation, and also writes those to the intermediate buffer.
>
> Add to that some careful handling of position and avail pointers in each
> of the buffers. This should result in reliable sample rate/buffer size
> converted audio.
>
> The next thing to consider is whether some applications can get sound to
> the pcm buffer without using snd_pcm_write(). For example memcpy for
> mmap applications. We would need some way to track these memcpy writes,
> so that the sample rate converter could take the new samples, and sample
> rate convert them into the intermediate buffer.
Well, even snd_pcm_read/write() are done in the mmap mode in rate
plugin...
The root of the problems of rate plugin is that we have no proper
interrupts for the given period configuration for the master PCM.
Using the irqs of the slave PCM may have restrictions that can't be
solved easily as mentioned above. Hence, I feel there are only two
solutions for this:
1. Allow variable period size
2. Introduce another interrupt source
1 will be a drastic change to the outside, so it's not an option right
now although it can be a good extension in future.
2 means extending the kernel-side PCM handler to allow arbitrary
interrupt source(s). For example, suppose we create a new timer
instance driven by the system timer. Then you can set any "tick"
value for a given timing.
The timer can be sync'ed with PCM DMA, too. That is, we adjust the
internal tick resolution at each time PCM DMA irq is issued (a la
PLL).
Also, for a better PCM emulation, we may introduce a new timer mode
for tick scheduling. The timer instance has a queue of each wake-up
duration. The user-space writes to the queue and reads in return.
The size of queue corresponds to the number of periods of PCM buffer,
and each queue value corresponds to the period size. You can set
different values to adjust the rounding error.
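A sketch of how such a queue could be filled so that the rounding error
never accumulates. It assumes a timer resolution of one microsecond and
a hypothetical helper name wakeup_queue; neither comes from an existing
kernel interface. The remainder of each division is carried forward, so
most entries are one tick longer than the first.

```python
def wakeup_queue(period_frames, rate, count, ticks_per_second=1_000_000):
    """Integer wake-up durations whose running sum tracks the exact period.

    Hypothetical helper; the tick unit (microseconds) is an assumption.
    """
    # Whole ticks per period, plus the remainder in units of 1/rate tick.
    base, rem = divmod(period_frames * ticks_per_second, rate)
    durations = []
    err = 0
    for _ in range(count):
        err += rem
        extra = err // rate      # 1 once the carried remainder reaches a tick
        err -= extra * rate
        durations.append(base + extra)
    return durations


# Four periods of 1024 frames at 44100 Hz (exactly 23219.954... us each).
print(wakeup_queue(1024, 44100, 4))   # [23219, 23220, 23220, 23220]
```

Over 441 periods the durations add up to exactly 10240000 us, which is
441 * 1024 / 44100 seconds, so the emulated period interrupt does not
drift against the sample clock it is derived from.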
It's just my $0.02.
Takashi
* Re: A suggestion to solve the ALSA resampling problem.
2005-07-04 16:35 ` Takashi Iwai
@ 2005-07-05 14:46 ` James Courtier-Dutton
2005-07-05 15:12 ` Takashi Iwai
0 siblings, 1 reply; 4+ messages in thread
From: James Courtier-Dutton @ 2005-07-05 14:46 UTC (permalink / raw)
To: Takashi Iwai; +Cc: James Courtier-Dutton, alsa-devel
On 7/4/05, Takashi Iwai <tiwai@suse.de> wrote:
> At Sat, 02 Jul 2005 13:23:42 +0100,
> James Courtier-Dutton wrote:
> >
> > Hi,
> >
> > There are essentually 2 problems to solve. I will take the alsa code in
> > the game doom3 as a way to demonstrate the problems.
> >
> > doom3 uses a sample rate of 44100. Most sound cards now work at 48000 in
> > hardware, so one needs to sample rate convert between 44100 to 48000 in
> > software in order to get games like doom3 to play well.
> > doom3 also opens the buffers at 1024 frames per period, and 4096 frames
> > per buffer. It then writes 1024 frames at a time to the buffer. If the
> > complete write fails, it complains and the sound sounds really bad.
> > This approach is fine in the sound card can do periods of 1024 frames,
> > but not all sound card hardware can, so we need to find an abstraction
> > layer in order to deliver these period and buffer sizes to the
> > application, while also serving the different period and buffer sizes of
> > the hardware.
> >
> > So, how about this.
> > The application can select any buffer size and period size they like.
> > For simplicity, we might impose that number_of_periods is an integer.
> >
> > So, the 44100 application buffer is 4096(buffer size),1024(period size)
> > The hardware 48000 buffer is 2048(buffer size), 512(period size)
> >
> > We add an intermediate buffer at 48000. This buffer size must be >=
> > 4096*48000/44100. period size must be >= 1024*48000/44100.
> > If we round up, we get:
> > intermediate buffer: 4458.23 -> 4459
> > intermediate period: 1114.56 -> 1115
> >
> > For ease of transfer between the intermediate buffer and the hardware,
> > we should really make the intermediate buffer_size = n*hardware period_size.
> > So, we have the following equations to satify:
> > intermediate_buffer_size = n * hardware_period_size.
> > intermediate_buffer_size >=
> > application_buffer*hardware_rate/application_rate.
> > intermediate_period_size = hardware_period_size.
> >
> > This results in the:
> > intermediate_buffer_size = 4608 (9 * 512)
> > intermediate_period_size = 512
>
> The sub-period is a good idea. However, this seems possible only when
> you can change the period size of the slave PCM arbitrarily. A
> typical bad example is dmix, which has the fixed period/buffer
> configuration.
Which is the slave, and which is the master?
master == hardware buffer, slave == application buffer
or
master == application buffer, slave == hardware buffer.
The current method used in alsa-lib is to try to map the applications
period directly onto the hardware period while changing sample rate at
the same time. I am suggesting adding a new buffer in between the two,
so that the application period writes into the intermediate buffer one
application period at a time, and the intermediate buffer writes into
the hardware buffer one hardware period at a time. So, there are 3
buffers here.
1) hardware buffer at 48000Hz.
2) intermediate buffer at 48000Hz.
3) Application buffer at 44100Hz.
The application buffer free runs and uses timer interrupts to trigger periods.
The application buffer has its own pointer so can tell the
application the status of the buffer.
Each time the application writes to the application buffer, it fills
the application buffer and updates the application hw pointers. It
also runs the snd_pcm_write() through a sample rate converter, that
results in a varying amount of frames written to the intermediate
buffer.
The hardware buffer free runs and transfers data from the intermediate
buffer each time the hardware period elapses.
So, the application sees a constant period size, the hardware also
sees a constant period size (but different from the application one).
The intermediate buffer is used to buffer between the two.
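To illustrate the "varying amount of frames" point, here is a sketch of
how many 48000 Hz frames each 1024-frame write would produce if the
converter carries its fractional position from one write to the next.
The function converted_frames is hypothetical and does no actual
resampling; it only counts frames.

```python
def converted_frames(writes, in_frames=1024, in_rate=44100, out_rate=48000):
    """Output frame count for each successive write of `in_frames` frames.

    Hypothetical counting model; no audio is resampled here.
    """
    counts = []
    produced = 0
    for k in range(1, writes + 1):
        # Total output frames owed after k writes, rounded down; the
        # fractional remainder is carried into the next write.
        total = k * in_frames * out_rate // in_rate
        counts.append(total - produced)
        produced = total
    return counts


counts = converted_frames(147)
print(counts[:5])    # [1114, 1115, 1114, 1115, 1114]
print(sum(counts))   # 163840
```

Each write yields 1114 or 1115 frames, and after 147 writes the total is
exactly 147 * 1024 * 48000 / 44100 = 163840 frames, so nothing is lost
or gained in the intermediate buffer over time.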
>
> Also, rounding up/down the fraction always introduces the round
> error. This may cause a drift in a long run.
The size of the intermediate buffer does not matter. It just has to be
bigger than the application wants. The resampler in the
snd_pcm_write() function will be using fractions, so the rounding of
the buffer size is not a problem as there is no requirement to map one
application period directly into one intermediate period.
>
> > Each time the sound card hardware interrupts (every 512 frames), alsa
> > copies the next period from the intermediate buffer to the hardware
> > buffer. This essentually allows for any sound card hardware to have as
> > many periods as they wish, and thus as larger buffer size as they like.
> > With the expense of one extra memcpy.
> >
> > Each time the application writes to the application buffer, alsa-lib
> > sample rate converts the number of frames in the snd_pcm_write()
> > operation, and also writes those to the intermediate buffer.
> >
> > Add to that some careful handling of position and avail pointers in each
> > of the buffers. This should result in reliable sample rate/buffer size
> > converted audio.
> >
> > The next thing to consider is whether some applications can get sound to
> > the pcm buffer without using snd_pcm_write(). For example memcpy for
> > mmap applications. We would need some way to track these memcpy writes,
> > so that the sample rate converter could take the new samples, and sample
> > rate convert them into the intermediate buffer.
>
> Well, even snd_pcm_read/write() are done in the mmap mode in rate
> plugin...
Good.
>
>
> The root of the problems of rate plugin is that we have no proper
> interrupts for the given period configuration for the master PCM.
> Using the irqs of the slave PCM may have restrictions that can't be
> solved easily as mentioned above. Hence, I feel there are only two
> solutions for this:
>
> 1. Allow variable period size
> 2. Introduce another interrupt source
>
> 1 will be a drastic change to the outside, so it's not an option right
> now although it can be a good extension in future.
I also think that this probably causes more problems than it helps
with, even though I actually suggested it some time ago.
>
> 2 means to extend the kernel side PCM handler to allow artibrary
> interrupt source(s). For example, suppose to create a new timer
> instance driven by the system timer. Then you can set any "tick"
> value for the certain timing.
Some sound cards, like the Audigy and ca0106 have "wall clocks" that
we can generate interrupts from, if we wish. This would be a good way
to get interrupts arriving in sync with the sound card hardware. So,
we would have 2 interrupts.
1) For period_elapsed() from the hardware buffer period interrupt.
(The current interrupt)
2) For period_elapsed() from the timer for the intermediate period
trigger. (From wall-clock, but I am not sure how many sound cards have
that feature.)
>
> The timer can be sync'ed with PCM DMA, too. That is, we adjust the
> internal tick resolution at each time PCM DMA irq is issued (a la
> PLL).
Good method if the sound card hardware does not support "wall clock".
>
> Also, for a better PCM emulation, we may introduce a new timer mode
> for tick scheduling. The timer instance has a queue of each wake-up
> duration. The user-space writes to the queue and reads in return.
> The size of queue corresponds to the number of periods of PCM buffer,
> and each queue value corresponds to the period size. You can set
> different values to adjust the rounding error.
>
>
> It's just my $0.02.
>
> Takashi
>
James
* Re: A suggestion to solve the ALSA resampling problem.
2005-07-05 14:46 ` James Courtier-Dutton
@ 2005-07-05 15:12 ` Takashi Iwai
0 siblings, 0 replies; 4+ messages in thread
From: Takashi Iwai @ 2005-07-05 15:12 UTC (permalink / raw)
To: James Courtier-Dutton; +Cc: alsa-devel
At Tue, 5 Jul 2005 15:46:13 +0100,
James Courtier-Dutton wrote:
>
> On 7/4/05, Takashi Iwai <tiwai@suse.de> wrote:
> > At Sat, 02 Jul 2005 13:23:42 +0100,
> > James Courtier-Dutton wrote:
> > >
> > > Hi,
> > >
> > > There are essentually 2 problems to solve. I will take the alsa code in
> > > the game doom3 as a way to demonstrate the problems.
> > >
> > > doom3 uses a sample rate of 44100. Most sound cards now work at 48000 in
> > > hardware, so one needs to sample rate convert between 44100 to 48000 in
> > > software in order to get games like doom3 to play well.
> > > doom3 also opens the buffers at 1024 frames per period, and 4096 frames
> > > per buffer. It then writes 1024 frames at a time to the buffer. If the
> > > complete write fails, it complains and the sound sounds really bad.
> > > This approach is fine in the sound card can do periods of 1024 frames,
> > > but not all sound card hardware can, so we need to find an abstraction
> > > layer in order to deliver these period and buffer sizes to the
> > > application, while also serving the different period and buffer sizes of
> > > the hardware.
> > >
> > > So, how about this.
> > > The application can select any buffer size and period size they like.
> > > For simplicity, we might impose that number_of_periods is an integer.
> > >
> > > So, the 44100 application buffer is 4096(buffer size),1024(period size)
> > > The hardware 48000 buffer is 2048(buffer size), 512(period size)
> > >
> > > We add an intermediate buffer at 48000. This buffer size must be >=
> > > 4096*48000/44100. period size must be >= 1024*48000/44100.
> > > If we round up, we get:
> > > intermediate buffer: 4458.23 -> 4459
> > > intermediate period: 1114.56 -> 1115
> > >
> > > For ease of transfer between the intermediate buffer and the hardware,
> > > we should really make the intermediate buffer_size = n*hardware period_size.
> > > So, we have the following equations to satify:
> > > intermediate_buffer_size = n * hardware_period_size.
> > > intermediate_buffer_size >=
> > > application_buffer*hardware_rate/application_rate.
> > > intermediate_period_size = hardware_period_size.
> > >
> > > This results in the:
> > > intermediate_buffer_size = 4608 (9 * 512)
> > > intermediate_period_size = 512
> >
> > The sub-period is a good idea. However, this seems possible only when
> > you can change the period size of the slave PCM arbitrarily. A
> > typical bad example is dmix, which has the fixed period/buffer
> > configuration.
> Which is the slave, and which is the master?
> master == hardware buffer, slave == application buffer
> or
> master == application buffer, slave == hardware buffer.
In the alsa-lib sense, slave PCM = hardware buffer, master =
application.
> The current method used in alsa-lib is to try to map the applications
> period directly onto the hardware period while changing sample rate at
> the same time. I am suggesting adding a new buffer in between the two,
> so that the application period writes into the intermediate buffer one
> application period at a time, and the intermediate buffer writes into
> the hardware buffer one hardware period at a time. So, there are 3
> buffers here.
> 1) hardware buffer at 48000Hz.
> 2) intermediate buffer at 48000Hz.
> 3) Application buffer at 44100Hz.
> The application buffer free runs and uses timer interrupts to trigger periods.
Note that the current rate plugin already has an intermediate
buffer. The samples are transferred in an asynchronous way.
The problem is that the rate plugin prefers to transfer the chunk of
data as a whole period. This limitation could be solved if we had an
arbitrary interrupt source.
> The application buffer has it's own pointer so can tell the
> application the status of the buffer.
> Each time the application writes to the application buffer, it fills
> the application buffer and updates the application hw pointers. It
> also runs the snd_pcm_write() through a sample rate converter, that
> results in a varying amount of frames written to the intermediate
> buffer.
> The hardware buffer free runs and transfers data from the intermediate
> buffer each time the hardware period ellapses.
>
> So, the application sees a constant period size, the hardware also
> sees a constant period size (but different from the application one).
> The intermediate buffer is used to buffer between the two.
>
> >
> > Also, rounding up/down the fraction always introduces the round
> > error. This may cause a drift in a long run.
> The size of the intermediate buffer does not matter. It just has to be
> bigger than the application wants. The resampler in the
> snd_pcm_write() function will be using fractions, so the rounding of
> the buffer size is not a problem as there is no requirement to map one
> application period directly into one intermediate period.
The whole point here is the "period size", not the buffer size.
Yes, the buffer size doesn't matter. But the period size does.
The period size defines the interrupt period, and ALSA assumes a
constant period size on the hardware side. Since 44.1kHz can't be
aligned in integers to 48kHz, there must be a drift as long as you
handle a constant integer period size (that is, if you take the irq
only from the soundcard period boundary, as normal soundcards
provide.)
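One way to see the misalignment is to express an application period in
units of hardware periods, using the numbers from this thread (1024
frames at 44100 Hz against 512 frames at 48000 Hz). This is only an
illustrative calculation; the variable names are made up for it.

```python
from fractions import Fraction

app_rate, hw_rate = 44100, 48000
app_period, hw_period = 1024, 512

# One application period measured in hardware periods.
ratio = Fraction(app_period * hw_rate, app_rate * hw_period)

# k application periods end exactly on a hardware interrupt only when
# k * ratio is an integer, i.e. when k is a multiple of the denominator.
first_aligned = ratio.denominator     # application periods until alignment
hw_periods_elapsed = ratio.numerator  # hardware periods in the same time

print(ratio)                              # 320/147
print(first_aligned, hw_periods_elapsed)  # 147 320
```

So with these sizes an application period boundary coincides with a
hardware interrupt only once every 147 application periods (about 3.4
seconds); at every other boundary the hardware irq arrives either early
or late.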
> > > Each time the sound card hardware interrupts (every 512 frames), alsa
> > > copies the next period from the intermediate buffer to the hardware
> > > buffer. This essentually allows for any sound card hardware to have as
> > > many periods as they wish, and thus as larger buffer size as they like.
> > > With the expense of one extra memcpy.
> > >
> > > Each time the application writes to the application buffer, alsa-lib
> > > sample rate converts the number of frames in the snd_pcm_write()
> > > operation, and also writes those to the intermediate buffer.
> > >
> > > Add to that some careful handling of position and avail pointers in each
> > > of the buffers. This should result in reliable sample rate/buffer size
> > > converted audio.
> > >
> > > The next thing to consider is whether some applications can get sound to
> > > the pcm buffer without using snd_pcm_write(). For example memcpy for
> > > mmap applications. We would need some way to track these memcpy writes,
> > > so that the sample rate converter could take the new samples, and sample
> > > rate convert them into the intermediate buffer.
> >
> > Well, even snd_pcm_read/write() are done in the mmap mode in rate
> > plugin...
> Good.
>
> >
> >
> > The root of the problems of rate plugin is that we have no proper
> > interrupts for the given period configuration for the master PCM.
> > Using the irqs of the slave PCM may have restrictions that can't be
> > solved easily as mentioned above. Hence, I feel there are only two
> > solutions for this:
> >
> > 1. Allow variable period size
> > 2. Introduce another interrupt source
> >
> > 1 will be a drastic change to the outside, so it's not an option right
> > now although it can be a good extension in future.
> I also think that this probably causes more problems than it helps
> with, even though I actually suggested it some time ago.
>
> >
> > 2 means to extend the kernel side PCM handler to allow artibrary
> > interrupt source(s). For example, suppose to create a new timer
> > instance driven by the system timer. Then you can set any "tick"
> > value for the certain timing.
> Some sound cards, like the Audigy and ca0106 have "wall clocks" that
> we can generate interrupts from, if we wish. This would be a good way
> to get interrupts arriving in sync with the sound card hardware. So,
> we would have 2 interrupts.
> 1) For period_elapsed() from the hardware buffer period interrupt.
> (The current interrupt)
> 2) For period_elapsed() from the timer for the intermediate period
> trigger. (From wall-clock, but I am not sure how many sound cards have
> that feature.
Remember that the interrupt source must be multiplexed, too.
For example, dmix would require an independent interrupt for each
instance, with a certain accuracy.
Takashi