* Re: Re: dmix plugin
@ 2003-02-17 22:28 Jaroslaw Sobierski
  0 siblings, 0 replies; 50+ messages in thread
From: Jaroslaw Sobierski @ 2003-02-17 22:28 UTC (permalink / raw)
  To: perex; +Cc: T.Motylewski, abramo.bagnara, alsa-devel


>> On Mon, 17 Feb 2003, Jaroslav Kysela wrote:
>> 
>> > Note that all your nice ideas lead into a blind alley. Who will silence the 
>> > sum buffer? The driver silences only the hardware buffer, which will not be used 
>> > for the calculation in your algorithm.
>> 
>> Silencing is not time critical; if the buffer is big enough, it does not matter
>> whether it is done 1 ms or 100 ms after the card has played the data. Therefore
>> it may be done by a separate thread/process/kernel task without any
>> interference with other processes writing to the buffer.
>
>It is time critical for the dmix plugin, because other processes might 
>write new samples to "empty" areas.
>

Clearing the sum buffer would be a task analogous, or I should probably say
reverse, to the saturation operation. Before you take the value in the sum
buffer and add your sample to it, you can check whether the destination sample
in the DMA buffer is zero. If it is, you disregard the value in the sum buffer
(it is now considered stale), overwrite it with your sample and proceed to
saturate it normally. If another thread has already written something there,
the DMA sample will be non-zero and you proceed as discussed before. If another
thread has written zeroes, or the result has summed up to zero, it still doesn't
matter, because then the sum buffer must also contain a zero, so it is right to
disregard its value. And that's it. OK, some synchronization would be in order
so that you don't kill a sample just written by some other thread, as in:

A                         B
check hw buff 0? yes
                          check hw buff 0? yes
                          write B sample to sum/hw
write A sample to sum/hw

A re-read after the write does not solve the problem this time, because
thread B could (though it is very unlikely) have written the same sample value.
But I'm sure we can come up with something for this.
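
A minimal single-threaded sketch of the "zero in the DMA buffer means the sum
is stale" check described above. The names (mix_with_stale_check, saturate16)
and the 32-bit sum type are assumptions made for illustration, and the race
from the A/B diagram is deliberately left unsolved here:

```c
#include <stdint.h>

/* Clamp a wide sum to the signed 16-bit range of the hardware sample. */
static int16_t saturate16(int32_t v)
{
    if (v > INT16_MAX)
        return INT16_MAX;
    if (v < INT16_MIN)
        return INT16_MIN;
    return (int16_t)v;
}

/*
 * Mix one sample into the sum/hw pair.
 * If the hw sample is zero, the driver has silenced it (or the sum really
 * is zero), so the old sum is disregarded instead of being added to.
 * NOT thread-safe: two writers can both see hw == 0 and one sample is lost.
 */
void mix_with_stale_check(int32_t *sum, int16_t *hw, int16_t sample)
{
    if (*hw == 0)
        *sum = sample;      /* stale (or zero) sum: overwrite */
    else
        *sum += sample;     /* live sum: accumulate as before */
    *hw = saturate16(*sum);
}
```

Starting from a stale sum with a silenced hw sample, the first writer simply
replaces the sum and later writers accumulate on top of it.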

That said, I still think it would be a better solution altogether to have
a buffer in an alsa-native not hardware-native format and have the driver
do the translation / saturation and the like. Yeah, I know that's not what
you want, I got it ;-).

--------------
Fycio (J.Sobierski)
 fycio@gucio.com


-------------------------------------------------------
This sf.net email is sponsored by:ThinkGeek
Welcome to geek heaven.
http://thinkgeek.com/sf

* Re: Re: dmix plugin
@ 2003-02-17 16:18 Jaroslaw Sobierski
  0 siblings, 0 replies; 50+ messages in thread
From: Jaroslaw Sobierski @ 2003-02-17 16:18 UTC (permalink / raw)
  To: T.Motylewski; +Cc: perex, abramo.bagnara, alsa-devel

>
>Well, but when adding a+b we have no idea that the overflow will be compensated
>by the next very big negative sample. Also, mixing signals which already fill 90% of
>the dynamic range is not a good idea. My "fix" is heuristic - it works for
>occasional _small_ overflows, e.g. 0x4100+0x4000 -> 0x7fff is much better than
>0x8100. 
>
>The idea of dmix as I understand it is that buffer is already in the native
>format for the sound card. So if sound card supports 24 bit, OK. But then
>people will start mixing 24 bit samples :-)
>
>> AFAIK most hardware does not mix by reducing volume before the sum. On the
>> contrary, it is usually summed "as is" to a wider register, and often even so
>
>And here our "wider register" is 16 bits. That means end users should not expect
>too much if they mix full-power signals on it.
>
>BTW, if you have uncorrelated signals, then to mix 4 signals it may be good
>enough to reduce their amplitude by just a factor of 2, because the power will drop
>by a factor of 4. Occasionally there will be overruns, but the 0x7fff limit will make them
>almost inaudible. Not a correct fix, but I can assure you that it works in
>standard cases :-)

That's a good point. As long as we're dealing with 2 or 3 channels we can probably
get by with saturating. But we should consider adding a shift right by one
(after adding, before saturation) once we have 4 channels, by two at 8 
channels, or something similar.
Otherwise we will start getting some ugly clipping artifacts. The problem is
that this can cause a (noticeable) sudden drop in volume when a "threshold" client
connects/disconnects. We could ramp, but that's a multiplication...
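
Roughly what I have in mind, as a sketch only. The thresholds (4 and 8
clients) are the ones suggested above; the function names and the 32-bit sum
are made up for illustration:

```c
#include <stdint.h>

/* Hypothetical policy: no scaling below 4 clients, >>1 from 4, >>2 from 8. */
unsigned headroom_shift(unsigned clients)
{
    if (clients >= 8)
        return 2;
    if (clients >= 4)
        return 1;
    return 0;
}

/* Scale the already-added sum, then saturate it to 16 bits. */
int16_t scale_and_saturate(int32_t sum, unsigned clients)
{
    /*
     * Division by a power of two stands in for the shift so that negative
     * sums are handled portably (it rounds toward zero, a real >> would
     * round down on most compilers).
     */
    int32_t v = sum / (int32_t)(1 << headroom_shift(clients));

    if (v > INT16_MAX)
        return INT16_MAX;
    if (v < INT16_MIN)
        return INT16_MIN;
    return (int16_t)v;
}
```

With 2 clients the sum 0x8100 clips to 0x7fff; with 4 clients the same sum
becomes 0x4080, which is exactly the sudden volume drop mentioned above.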

--------------
Fycio (J.Sobierski)
 fycio@gucio.com


* Re: Re: dmix plugin
@ 2003-02-17 15:32 Jaroslaw Sobierski
  2003-02-17 19:45 ` Jaroslav Kysela
  0 siblings, 1 reply; 50+ messages in thread
From: Jaroslaw Sobierski @ 2003-02-17 15:32 UTC (permalink / raw)
  To: abramo.bagnara; +Cc: perex, alsa-devel

>> I see, the read/saturate/write must be atomic, too. In this case, it would
>> be better to use a global (or a set of) mutex(es) to lock the hardware
>> ring buffer. The futexes are nice.
>
>They are nice indeed, but definitely not the right solution here.
>
>Although I don't know if it's the absolute best solution, the 'retry'
>approach I've proposed is far better and much more efficient.

I have to agree with Abramo. A global mutex would cause long and unnecessary 
waits for the processes trying to write to the plugin. Locking access to
individual parts of the buffer is messy. Notice that concurrent writes 
to the same sample in the buffer will occur only sporadically, and the "re-read"
in the loop costs almost nothing, while synchronization mechanisms could 
block often.
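
For illustration, the lock-free "retry" idea (an atomic add into a shared sum
buffer, then saturate-and-store with a re-read) might look like this with C11
atomics. This is only a sketch: the function name and the use of <stdatomic.h>
are my assumptions, not what the plugin actually does.

```c
#include <stdatomic.h>
#include <stdint.h>

/*
 * Add src to the shared 32-bit sum, then publish the saturated value to the
 * 16-bit hardware sample. If another writer changed the sum while we were
 * storing, loop and publish again; no mutex is taken at any point.
 */
void mix_sample_retry(_Atomic int32_t *sw, _Atomic int16_t *hw, int16_t src)
{
    int32_t v;
    int16_t s;

    atomic_fetch_add(sw, src);          /* the "xadd" step */
    do {
        v = atomic_load(sw);
        if (v > INT16_MAX)
            s = INT16_MAX;
        else if (v < INT16_MIN)
            s = INT16_MIN;
        else
            s = (int16_t)v;
        atomic_store(hw, s);
    } while (v != atomic_load(sw));     /* re-read: rarely true in practice */
}
```

In the common uncontended case the loop body runs exactly once, so the only
extra cost over a plain store is one additional load.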

--------------
Fycio (J.Sobierski)
 fycio@gucio.com


* Re: Re: dmix plugin
@ 2003-02-17 13:12 Jaroslaw Sobierski
  2003-02-17 13:22 ` Jaroslav Kysela
  2003-02-17 13:24 ` Jaroslav Kysela
  0 siblings, 2 replies; 50+ messages in thread
From: Jaroslaw Sobierski @ 2003-02-17 13:12 UTC (permalink / raw)
  To: abramo.bagnara; +Cc: perex, alsa-devel

Abramo Bagnara wrote:
>If we needed to use an intermediate buffer and a mixing thread, the dmix
>approach would lose its appeal for us.
>
>A solution might be to have a shared parallel sw ring buffer where to
>store the exact value:
>
>	xadd(sw, *src);
>	do {
>		v = *sw;
>		if (v > 0x7fff)
>			s = 0x7fff;
>		else if (v < -0x8000)
>			s = -0x8000;
>		else
>			s = v;
>		*hw = s;	/* store the saturated value, not the raw sum */
>	} while (unlikely(v != *sw));
>	
>This should solve also the atomicity update.

Very true, and it is consistent with what
Jaroslav Kysela wrote:
> My point was that all processes operate simultaneously and independently.  
> So if one process updates an area in the "sum" ring buffer, then it MUST
> transfer the changed area (with saturation) to the DMA buffer. So there is no 
> "once saturation" as you think. Anyway, the current implementation also uses 
> saturation for all clients (processes), so the only drawback is the 
> additional access to the "sum" ring buffer memory area.

So it seems like a good compromise to solve all our problems :-). 

Still, don't we already *have* a feeding thread for the sound card? I mean,
it doesn't just grab the memory buffer all by itself whenever it wants?
Excuse my ignorance on this topic; I'm only just starting with ALSA, and I
have not yet had the time to go through the entire source code ;-).
I remember when I was writing a driver for an mpeg2 decoder card that I
had to create 2 threads, one for feeding video and one for audio. The
FIFO level was checked either by polling or via interrupt handlers, but
I still had control over what is transferred and when. I could let the
card pull the data via DMA using bus mastering, but I still knew what 
would be sent and from where...
Does the problem lie in the fact that it is actually a plugin and has
no control of the transfer? Maybe it would be worth considering a callback
for the plugin from the main alsa module to inform it that a new piece
of the DMA buffer must be "prepared", whatever that could mean for a
particular plugin. Anyway, just a thought.

--------------
Fycio (J.Sobierski)
 fycio@gucio.com


* Re: Re: dmix plugin
@ 2003-02-17 11:18 Jaroslaw Sobierski
  2003-02-17 11:53 ` Jaroslav Kysela
  0 siblings, 1 reply; 50+ messages in thread
From: Jaroslaw Sobierski @ 2003-02-17 11:18 UTC (permalink / raw)
  To: perex; +Cc: alsa-devel

>> In our case, such "solution" would have to affect the whole buffer, meaning 
>> we would need 3 (or better yet 4) bytes per sample, which would eventually get
>> reduced back to 2 bytes on the way out to the sound card. This seems neither
>> elegant nor memory efficient but would work, and also solves the "a)" problem
>> because we don't need to saturate so an atomic add can be performed on each
>> sample. 
>
>Yes, this solution is good. I've thought about it, too. Unfortunately, it 
>adds additional transfers including saturation from the "sum" ring buffer 
>to the DMA buffer of hardware.

Hmmm... Not exactly. This is not a problem. First of all, it is way
better to saturate once (i.e. just before the transfer) than to do it for
every channel individually, since saturation is a costly operation involving
a conditional jump (unless you optimize for MMX). If you're mixing 4
channels you do it once, not 4 times. Storing the result in a different
buffer, rather than putting it back in its original place, hardly seems
a big difference (except for cache hits maybe).

Also, if you insist on sparing memory (the buffer is not *that*
big is it?) you can lay it out as two separate (ring) buffers, one 
holding upper words, the other holding lower words. Now instead of 
shifting the samples right n-bits before adding to the buffer, you 
shift them left 16-n. In effect you will get a buffer (the upper part) 
which can be directly sent to the audio hw, and which was summed and 
divided without losing precision. The drawback is you lose the atomic 
add. If you don't shift, you can still saturate into the "upper" buffer 
and DMA from there.
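
A rough sketch of the split-buffer layout, under my own assumptions: n is
log2 of the maximum number of clients, hi[] is the buffer the hardware would
read, lo[] keeps the fractional bits, and the function names are invented.
The read-modify-write over both halves is exactly where the atomic add is
lost.

```c
#include <stdint.h>

/* Portable reinterpretation of a 16-bit pattern as a signed sample. */
static int16_t u16_to_s16(uint16_t u)
{
    return u >= 0x8000u ? (int16_t)((int32_t)u - 0x10000) : (int16_t)u;
}

/*
 * Add one sample, pre-shifted left by (16 - n), to the accumulator whose
 * upper and lower words live in two separate buffers (0 <= n <= 16).
 * After up to 2^n clients have added, hi[i] holds sum / 2^n (rounded down)
 * and lo[i] holds the bits a plain "shift right by n" would have dropped.
 * NOT atomic: the carry from lo[] into hi[] needs both words at once.
 */
void split_add(int16_t *hi, uint16_t *lo, int i, int16_t sample, unsigned n)
{
    /* Rebuild the 32-bit accumulator; unsigned math wraps without UB. */
    uint32_t acc = ((uint32_t)(uint16_t)hi[i] << 16) | lo[i];

    acc += (uint32_t)((int32_t)sample * (int32_t)(1 << (16 - n)));

    hi[i] = u16_to_s16((uint16_t)(acc >> 16));
    lo[i] = (uint16_t)(acc & 0xffffu);
}
```

For example, with n = 2 the four samples 0x7500, 0x7400, -0x7d00 and 3 leave
6912 in hi[] (their sum 27651 divided by 4, rounded down) and 0xc000 in lo[]
(the remaining 3/4).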


--------------
Fycio (J.Sobierski)
 fycio@gucio.com


* Re: dmix plugin
@ 2003-02-17 10:04 Jaroslaw Sobierski
  2003-02-17 10:15 ` Jaroslav Kysela
  2003-02-17 10:32 ` tomasz motylewski
  0 siblings, 2 replies; 50+ messages in thread
From: Jaroslaw Sobierski @ 2003-02-17 10:04 UTC (permalink / raw)
  To: alsa-devel

> > b) sum overflow: we can lower volume of samples before sum; I think that
> >    hardware works in this way, too
> 
> Here I don't understand you. Suppose we have 3 samples to mix:
> a = 0x7500
> b = 0x7400
> c = 0x8300
> 
> If you do a + b + c (in this order) you obtain:
> d=0
> d += a -> 0x7500
> d += b -> 0xe900 -> 0x7fff
> d += c -> 0x02ff
> 
> while the correct result is 0x6c00. You see?

AFAIK most hardware does not mix by reducing volume before the sum. On the
contrary, it is usually summed "as is" into a wider register, and often even
used that way. For example, a sound card able to mix 16 channels of 16 bits would have
a 16+4 bit or 24 bit register where the channels are added and no saturation
can occur. In good hardware this would not even be downscaled back to 16 bits,
but a 24 bit D/A converter would be used instead. In older times (Gravis
UltraSound and I think older SB AWE) this could easily be spotted by the difference
in supported "hardware" channels and "software" channels. A card with a 32 bit
sum register and 24 bit DA could support (as above) 16 hardware channels and, 
for example, 64 software channels (mixed together in quadruplets to the 16 hw).

In our case, such "solution" would have to affect the whole buffer, meaning 
we would need 3 (or better yet 4) bytes per sample, which would eventually get
reduced back to 2 bytes on the way out to the sound card. This seems neither
elegant nor memory efficient but would work, and also solves the "a)" problem
because we don't need to saturate so an atomic add can be performed on each
sample. 
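
To make the quoted a + b + c example concrete, here is a small sketch that
compares saturating after every addition with summing into a wider register
and saturating once. The function names are mine, chosen for illustration;
0x8300 is written as the signed 16-bit value -0x7d00.

```c
#include <stdint.h>

/* Clamp a wide sum to the signed 16-bit range. */
static int16_t saturate16(int32_t v)
{
    if (v > INT16_MAX)
        return INT16_MAX;
    if (v < INT16_MIN)
        return INT16_MIN;
    return (int16_t)v;
}

/* Saturate after every addition, as a 16-bit-only buffer forces us to. */
int16_t mix_saturate_each(const int16_t *src, int n)
{
    int16_t d = 0;

    for (int i = 0; i < n; i++)
        d = saturate16((int32_t)d + src[i]);
    return d;
}

/* Sum "as is" in a wider register and saturate once at the very end. */
int16_t mix_wide(const int16_t *src, int n)
{
    int32_t sum = 0;

    for (int i = 0; i < n; i++)
        sum += src[i];
    return saturate16(sum);
}
```

For the three samples above, mix_saturate_each() ends at 0x02ff while
mix_wide() gives the correct 0x6c00.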

--------------
Fycio (J.Sobierski)
 fycio@gucio.com



end of thread, other threads:[~2003-02-21 14:08 UTC | newest]

Thread overview: 50+ messages
2003-02-17 22:28 Re: dmix plugin Jaroslaw Sobierski
  -- strict thread matches above, loose matches on Subject: below --
2003-02-17 16:18 Jaroslaw Sobierski
2003-02-17 15:32 Jaroslaw Sobierski
2003-02-17 19:45 ` Jaroslav Kysela
2003-02-17 20:44   ` tomasz motylewski
2003-02-17 20:59     ` Jaroslav Kysela
2003-02-18 10:00   ` Abramo Bagnara
2003-02-18 12:52     ` Jaroslav Kysela
2003-02-18 13:10       ` Jaroslaw Sobierski
2003-02-18 13:19         ` Jaroslav Kysela
2003-02-18 14:51       ` Paul Davis
2003-02-18 16:51         ` Jaroslav Kysela
2003-02-18 21:07     ` Jaroslav Kysela
2003-02-19 10:20       ` Abramo Bagnara
2003-02-19 11:01         ` Jaroslav Kysela
2003-02-19 11:17           ` Abramo Bagnara
2003-02-19 13:49             ` Abramo Bagnara
2003-02-19 15:45               ` Jaroslaw Sobierski
2003-02-19 20:39                 ` Abramo Bagnara
2003-02-19 18:34               ` Jaroslav Kysela
2003-02-19 21:24                 ` Jaroslav Kysela
2003-02-20  8:28                 ` Abramo Bagnara
2003-02-20  8:30                 ` Jaroslaw Sobierski
2003-02-20  8:48                   ` Abramo Bagnara
2003-02-20  8:53                 ` Abramo Bagnara
2003-02-20 16:49                   ` Jaroslav Kysela
2003-02-20 17:57                     ` Abramo Bagnara
2003-02-20 18:26                       ` Paul Davis
2003-02-20 22:14                         ` Abramo Bagnara
2003-02-20 19:55                       ` Jaroslav Kysela
2003-02-20 21:19                         ` tomasz motylewski
2003-02-20 21:27                           ` Jaroslav Kysela
2003-02-21 10:25                         ` Abramo Bagnara
2003-02-21 14:08                         ` Jaroslaw Sobierski
2003-02-19 10:33       ` Jaroslaw Sobierski
2003-02-19 11:08         ` Jaroslav Kysela
2003-02-17 13:12 Jaroslaw Sobierski
2003-02-17 13:22 ` Jaroslav Kysela
2003-02-17 18:15   ` Paul Davis
2003-02-18 22:36     ` Abramo Bagnara
2003-02-17 13:24 ` Jaroslav Kysela
2003-02-17 11:18 Jaroslaw Sobierski
2003-02-17 11:53 ` Jaroslav Kysela
2003-02-17 10:04 Jaroslaw Sobierski
2003-02-17 10:15 ` Jaroslav Kysela
2003-02-17 12:15   ` Abramo Bagnara
2003-02-17 13:12     ` Jaroslav Kysela
2003-02-17 13:29       ` Abramo Bagnara
2003-02-17 15:00         ` Jaroslav Kysela
2003-02-17 15:21           ` Abramo Bagnara
2003-02-17 10:32 ` tomasz motylewski
