public inbox for alsa-devel@alsa-project.org
 help / color / mirror / Atom feed
* fixing seq_midi: MIDI output buffer overrun
@ 2018-12-24 20:43 Adam Goode
  2018-12-25 11:34 ` Takashi Sakamoto
  0 siblings, 1 reply; 4+ messages in thread
From: Adam Goode @ 2018-12-24 20:43 UTC (permalink / raw)
  To: alsa-devel

Hi,

I am investigating why Chromium can't reliably send sysex messages on
Linux. (See https://bugs.chromium.org/p/chromium/issues/detail?id=917708.)

I've seen the message at
https://alsa-devel.alsa-project.narkive.com/6w35hffF/sysex-overflow-when-using-the-midi-sequencer-event-interface

Chromium does use seq and we do need sysex to be reliable since
vendors such as Novation and Yamaha use Web MIDI to interface with
hardware.

The code in question predates git history (2005). It's
event_process_midi() and dump_midi() in seq_midi.c.

I see that dump_midi() returns either ENOMEM or EINVAL under certain
conditions, but the only caller (event_process_midi()) throws these
errors away and always returns 0 if everything but dump_midi()
succeeds.

A few questions:

1. Can I change dump_midi to block instead of return ENOMEM if I don't
use SND_SEQ_NONBLOCK in snd_seq_open?

2. What should be returned in NONBLOCK mode? Probably EAGAIN?

3. The general blocking behavior of seq seems inconsistent. If I am
using snd_seq_ev_set_direct, then I get EAGAIN even with a blocking
open. Should I expect blocking from the pool, or only if using a
queue?

4. Is there any obvious fix I am overlooking?


Thanks,

Adam

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: fixing seq_midi: MIDI output buffer overrun
  2018-12-24 20:43 fixing seq_midi: MIDI output buffer overrun Adam Goode
@ 2018-12-25 11:34 ` Takashi Sakamoto
  2018-12-26  1:29   ` Adam Goode
  0 siblings, 1 reply; 4+ messages in thread
From: Takashi Sakamoto @ 2018-12-25 11:34 UTC (permalink / raw)
  To: Adam Goode; +Cc: alsa-devel

Hi,

On Mon, Dec 24, 2018 at 03:43:58PM -0500, Adam Goode wrote:
> I am investigating why Chromium can't reliably send sysex messages on
> Linux. (See https://bugs.chromium.org/p/chromium/issues/detail?id=917708.)
> 
> I've seen the message at
> https://alsa-devel.alsa-project.narkive.com/6w35hffF/sysex-overflow-when-using-the-midi-sequencer-event-interface
> 
> Chromium does use seq and we do need sysex to be reliable since
> vendors such as Novation and Yamaha use Web MIDI to interface with
> hardware.
> 
> The code in question predates git history (2005). It's
> event_process_midi() and dump_midi() in seq_midi.c.
> 
> I see that dump_midi() returns either ENOMEM or EINVAL under certain
> conditions, but the only caller (event_process_midi()) throws these
> errors away and always returns 0 if everything but dump_midi()
> succeeds.

It's difficult to regenerate your issue just according to the above
information. But I guess that chromium process tries to send a batch
of system-exclusive MIDI messages in one operation.

If so, it's easy to bring buffer-overflow in buffers of rawmidi
substream for target sound card (= kernel client of ALSA sequencer).
As a default, the size of buffer is the same as memory page frame[1]
thus the system-exclusive message is 4,096 bytes or more.

On the other hand, physical layer of serial bus for MIDI communication
has 31,250 bits/sec bandwidth. This means that drivers need over 1 sec
to finish transmission of the message.

> A few questions:
> 
> 1. Can I change dump_midi to block instead of return ENOMEM if I don't
> use SND_SEQ_NONBLOCK in snd_seq_open?

This is a callgraph for callers of 'dump_midi()'.

(sound/core/seq/seq_midi.c)
dump_midi()
<-snd_seq_dump_var_event()
  <-event_process_midi()
    (sound/core/seq/seq_clientmgr.c)
    <-snd_seq_deliver_single_event()
      <-broadcast_event()
      <-bounce_error_event()
      <-port_broadcast_event()
      <-deliver_to_subscribers()
        <-snd_seq_deliver_event()
          <-snd_seq_client_enqueue_event()
            <-snd_seq_write()
              <-write(2)
          <-snd_seq_kernel_client_dispatch()
            <-(in-kernel)
          <-snd_seq_dispatch_event()
            (sound/core/seq/seq_queue.c)
            <-snd_seq_check_queue()
              <-snd_seq_enqueue_event()
                <-snd_seq_dispatch_event()
		  <-(loop)
                <-snd_seq_client_enqueue_event()
                  <-snd_seq_write()
		    <-write(2)
                  <-kernel_client_enqueue()
  		    <-(in-kernel)

If changing behaviour of the 'dump_midi()' function, we need to care
all of the above. Especially, in ALSA sequencer subsystem, there're
two types of client; 'kernel' and 'user'. When introducing the
non-blocking behaviour, we need to investigate kernel clients.If any
call on the above graph runs in any IRQ context, usage of task
scheduler can break Linux system.

> 2. What should be returned in NONBLOCK mode? Probably EAGAIN?

In this subsystem, it's natural behaviour to return EAGAIN to callers
with non-blocking mode.

> 3. The general blocking behavior of seq seems inconsistent. If I am
> using snd_seq_ev_set_direct, then I get EAGAIN even with a blocking
> open. Should I expect blocking from the pool, or only if using a
> queue?
>
> 4. Is there any obvious fix I am overlooking?

If this issue should be fixed immediately, it's better to change Chromium
implementation not to write a batch of MIDI messages in one time to
ALSA seq character device, with enough care of narrow bandwidth of
serial bus for MIDI communication.

In this point, usage of queue of ALSA sequencer is one option.

I note that

[1] https://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound.git/tree/sound/core/rawmidi.c#n114


Regards

Takashi Sakamoto

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: fixing seq_midi: MIDI output buffer overrun
  2018-12-25 11:34 ` Takashi Sakamoto
@ 2018-12-26  1:29   ` Adam Goode
  2018-12-26  5:13     ` Takashi Sakamoto
  0 siblings, 1 reply; 4+ messages in thread
From: Adam Goode @ 2018-12-26  1:29 UTC (permalink / raw)
  To: Adam Goode, alsa-devel

Thank you for your detailed response.

On Tue, Dec 25, 2018 at 6:34 AM Takashi Sakamoto
<o-takashi@sakamocchi.jp> wrote:
>
> Hi,
>
> On Mon, Dec 24, 2018 at 03:43:58PM -0500, Adam Goode wrote:
> > I am investigating why Chromium can't reliably send sysex messages on
> > Linux. (See https://bugs.chromium.org/p/chromium/issues/detail?id=917708.)
> >
> > I've seen the message at
> > https://alsa-devel.alsa-project.narkive.com/6w35hffF/sysex-overflow-when-using-the-midi-sequencer-event-interface
> >
> > Chromium does use seq and we do need sysex to be reliable since
> > vendors such as Novation and Yamaha use Web MIDI to interface with
> > hardware.
> >
> > The code in question predates git history (2005). It's
> > event_process_midi() and dump_midi() in seq_midi.c.
> >
> > I see that dump_midi() returns either ENOMEM or EINVAL under certain
> > conditions, but the only caller (event_process_midi()) throws these
> > errors away and always returns 0 if everything but dump_midi()
> > succeeds.
>
> It's difficult to regenerate your issue just according to the above
> information. But I guess that chromium process tries to send a batch
> of system-exclusive MIDI messages in one operation.

An easy way to reproduce is to have amidicat send a sysex file to any
USB MIDI device:
http://krellan.com/amidicat/

amidicat   --port 28:0 test.syx

(You can send anything, including /dev/urandom on stdin.)

amidicat has a purpose similar to Web MIDI in Chromium, it opens in
blocking mode and uses snd_midi_event_encode_byte to convert MIDI
bytes into seq events without doing much inspection on the data
itself.
If you look at write_event() in amidicat.c you can see it tries to
send bytes to seq with as low latency or processing as possible. But
there is a comment in the code:

/* FUTURE: Even though this loop works, it's still too easy to flood
the other end and overrun, perhaps a queuing bug internally within
ALSA? */

Which is probably this same problem.

>
> If so, it's easy to bring buffer-overflow in buffers of rawmidi
> substream for target sound card (= kernel client of ALSA sequencer).
> As a default, the size of buffer is the same as memory page frame[1]
> thus the system-exclusive message is 4,096 bytes or more.
>

When using rawmidi directly, there is no overflow since the user gets
blocked. In ALSA sequencer, dump_midi() first checks rawmidi's
internal buffer and fails if there is not enough space.

> On the other hand, physical layer of serial bus for MIDI communication
> has 31,250 bits/sec bandwidth. This means that drivers need over 1 sec
> to finish transmission of the message.
>
> > A few questions:
> >
> > 1. Can I change dump_midi to block instead of return ENOMEM if I don't
> > use SND_SEQ_NONBLOCK in snd_seq_open?
>
> This is a callgraph for callers of 'dump_midi()'.
>
> (sound/core/seq/seq_midi.c)
> dump_midi()
> <-snd_seq_dump_var_event()
>   <-event_process_midi()
>     (sound/core/seq/seq_clientmgr.c)
>     <-snd_seq_deliver_single_event()
>       <-broadcast_event()
>       <-bounce_error_event()
>       <-port_broadcast_event()
>       <-deliver_to_subscribers()
>         <-snd_seq_deliver_event()
>           <-snd_seq_client_enqueue_event()
>             <-snd_seq_write()
>               <-write(2)
>           <-snd_seq_kernel_client_dispatch()
>             <-(in-kernel)
>           <-snd_seq_dispatch_event()
>             (sound/core/seq/seq_queue.c)
>             <-snd_seq_check_queue()
>               <-snd_seq_enqueue_event()
>                 <-snd_seq_dispatch_event()
>                   <-(loop)
>                 <-snd_seq_client_enqueue_event()
>                   <-snd_seq_write()
>                     <-write(2)
>                   <-kernel_client_enqueue()
>                     <-(in-kernel)
>
> If changing behaviour of the 'dump_midi()' function, we need to care
> all of the above. Especially, in ALSA sequencer subsystem, there're
> two types of client; 'kernel' and 'user'. When introducing the
> non-blocking behaviour, we need to investigate kernel clients.If any
> call on the above graph runs in any IRQ context, usage of task
> scheduler can break Linux system.
>

Yes, maybe I will try to introduce proper error returns before trying
to solve blocking mode.

> > 2. What should be returned in NONBLOCK mode? Probably EAGAIN?
>
> In this subsystem, it's natural behaviour to return EAGAIN to callers
> with non-blocking mode.
>

I think EAGAIN would work for normal events, but sysex uses
snd_seq_dump_var_event() which calls dump_midi in a loop. It is tricky
because user-to-user sysex doesn't have the same size limit that
user-to-rawmidi does. We might be able to return ENOMEM immediately if
the size of the sysex message size exceeds the rawmidi buffer, but
only for rawmidi. This is probably ok since no sysex message sent to
rawmidi through seq can be larger than PAGESIZE already. If we can
ensure that we never return EAGAIN in the middle of sending a sysex,
then that should work ok.

> > 3. The general blocking behavior of seq seems inconsistent. If I am
> > using snd_seq_ev_set_direct, then I get EAGAIN even with a blocking
> > open. Should I expect blocking from the pool, or only if using a
> > queue?
> >
> > 4. Is there any obvious fix I am overlooking?
>
> If this issue should be fixed immediately, it's better to change Chromium
> implementation not to write a batch of MIDI messages in one time to
> ALSA seq character device, with enough care of narrow bandwidth of
> serial bus for MIDI communication.
>

No, I don't think it needs to be fixed immediately. I just happened to
find this myself.

In general, the main workaround for this (used by amidicat and dosbox)
are simply to delay some amount between messages. I don't want to do
that in Chromium because that would complicate the code quite a bit
and since this introduces artificial delay that is not needed for
user-to-user MIDI communications.

> In this point, usage of queue of ALSA sequencer is one option.

>
> I note that
>
> [1] https://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound.git/tree/sound/core/rawmidi.c#n114
>
>
> Regards
>
> Takashi Sakamoto

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: fixing seq_midi: MIDI output buffer overrun
  2018-12-26  1:29   ` Adam Goode
@ 2018-12-26  5:13     ` Takashi Sakamoto
  0 siblings, 0 replies; 4+ messages in thread
From: Takashi Sakamoto @ 2018-12-26  5:13 UTC (permalink / raw)
  To: Adam Goode; +Cc: alsa-devel

Hi,

On Tue, Dec 25, 2018 at 08:29:00PM -0500, Adam Goode wrote:
> An easy way to reproduce is to have amidicat send a sysex file to any
> USB MIDI device:
> http://krellan.com/amidicat/
> 
> amidicat   --port 28:0 test.syx
> 
> (You can send anything, including /dev/urandom on stdin.)
> 
> amidicat has a purpose similar to Web MIDI in Chromium, it opens in
> blocking mode and uses snd_midi_event_encode_byte to convert MIDI
> bytes into seq events without doing much inspection on the data
> itself.
> If you look at write_event() in amidicat.c you can see it tries to
> send bytes to seq with as low latency or processing as possible. But
> there is a comment in the code:
> 
> /* FUTURE: Even though this loop works, it's still too easy to flood
> the other end and overrun, perhaps a queuing bug internally within
> ALSA? */
> 
> Which is probably this same problem.
 
I note that 'How to reproduce' is the most important when taking any issue.
In your case:

1. Prepare a batch of sysex messages over 4096 bytes or more
2. Open user client to ALSA sequencer
3. Write a batch of sysex messages toward rawmidi kernel client of ALSA
   sequencer.
4. Even if returning from the above operation, the peer don't still receive
   all of the messages. Even if enough later, the peer neither yet. This
   is an issue.

The cause is buffer-overflow detection in rawmidi substream. A rawmidi
kernel client of ALSA sequencer discourages data transmission in
this case without any retry.

> > If so, it's easy to bring buffer-overflow in buffers of rawmidi
> > substream for target sound card (= kernel client of ALSA sequencer).
> > As a default, the size of buffer is the same as memory page frame[1]
> > thus the system-exclusive message is 4,096 bytes or more.
> >
> 
> When using rawmidi directly, there is no overflow since the user gets
> blocked. In ALSA sequencer, dump_midi() first checks rawmidi's
> internal buffer and fails if there is not enough space.
>
> > On the other hand, physical layer of serial bus for MIDI communication
> > has 31,250 bits/sec bandwidth. This means that drivers need over 1 sec
> > to finish transmission of the message.
> >
> > > A few questions:
> > >
> > > 1. Can I change dump_midi to block instead of return ENOMEM if I don't
> > > use SND_SEQ_NONBLOCK in snd_seq_open?
> >
> > This is a callgraph for callers of 'dump_midi()'.
> >
> > (sound/core/seq/seq_midi.c)
> > dump_midi()
> > <-snd_seq_dump_var_event()
> >   <-event_process_midi()
> >     (sound/core/seq/seq_clientmgr.c)
> >     <-snd_seq_deliver_single_event()
> >       <-broadcast_event()
> >       <-bounce_error_event()
> >       <-port_broadcast_event()
> >       <-deliver_to_subscribers()
> >         <-snd_seq_deliver_event()
> >           <-snd_seq_client_enqueue_event()
> >             <-snd_seq_write()
> >               <-write(2)
> >           <-snd_seq_kernel_client_dispatch()
> >             <-(in-kernel)
> >           <-snd_seq_dispatch_event()
> >             (sound/core/seq/seq_queue.c)
> >             <-snd_seq_check_queue()
> >               <-snd_seq_enqueue_event()
> >                 <-snd_seq_dispatch_event()
> >                   <-(loop)
> >                 <-snd_seq_client_enqueue_event()
> >                   <-snd_seq_write()
> >                     <-write(2)
> >                   <-kernel_client_enqueue()
> >                     <-(in-kernel)
> >
> > If changing behaviour of the 'dump_midi()' function, we need to care
> > all of the above. Especially, in ALSA sequencer subsystem, there're
> > two types of client; 'kernel' and 'user'. When introducing the
> > non-blocking behaviour, we need to investigate kernel clients.If any
> > call on the above graph runs in any IRQ context, usage of task
> > scheduler can break Linux system.
> >
> 
> Yes, maybe I will try to introduce proper error returns before trying
> to solve blocking mode.
> 
> > > 2. What should be returned in NONBLOCK mode? Probably EAGAIN?
> >
> > In this subsystem, it's natural behaviour to return EAGAIN to callers
> > with non-blocking mode.
> >
> 
> I think EAGAIN would work for normal events, but sysex uses
> snd_seq_dump_var_event() which calls dump_midi in a loop. It is tricky
> because user-to-user sysex doesn't have the same size limit that
> user-to-rawmidi does. We might be able to return ENOMEM immediately if
> the size of the sysex message size exceeds the rawmidi buffer, but
> only for rawmidi. This is probably ok since no sysex message sent to
> rawmidi through seq can be larger than PAGESIZE already. If we can
> ensure that we never return EAGAIN in the middle of sending a sysex,
> then that should work ok.
> 
> > > 3. The general blocking behavior of seq seems inconsistent. If I am
> > > using snd_seq_ev_set_direct, then I get EAGAIN even with a blocking
> > > open. Should I expect blocking from the pool, or only if using a
> > > queue?

I've got it. Then I note that we should take enough care of kernel client,
queue scheduled by kernel timer, and so on. Blocking behaviour with calls
of task scheduler, or propagation of EAGAIN to expect callers to retry
should break nothing, as much as possible. Any test programs (or test
drivers) is preferrable.

> > > 4. Is there any obvious fix I am overlooking?
> >
> > If this issue should be fixed immediately, it's better to change Chromium
> > implementation not to write a batch of MIDI messages in one time to
> > ALSA seq character device, with enough care of narrow bandwidth of
> > serial bus for MIDI communication.
> >
> 
> No, I don't think it needs to be fixed immediately. I just happened to
> find this myself.
> 
> In general, the main workaround for this (used by amidicat and dosbox)
> are simply to delay some amount between messages. I don't want to do
> that in Chromium because that would complicate the code quite a bit
> and since this introduces artificial delay that is not needed for
> user-to-user MIDI communications.

It's natural to have differences between two cases; the peer is rawmidi
kernel client, and the peer is userspace application because userspace
application never has bandwidth limitation to process received MIDI
messages, in theory.

However, if you want to change behaviour of rawmidi client and the
others in ALSA sequencer, please post your patches. I'm willing to
review them.


Regards

Takashi Sakamoto

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2018-12-26  5:14 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2018-12-24 20:43 fixing seq_midi: MIDI output buffer overrun Adam Goode
2018-12-25 11:34 ` Takashi Sakamoto
2018-12-26  1:29   ` Adam Goode
2018-12-26  5:13     ` Takashi Sakamoto

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox