From: Stefan Priebe - Profihost AG <s.priebe@profihost.ag>
To: mdroth <mdroth@linux.vnet.ibm.com>
Cc: qemu-devel@nongnu.org, qemu-stable@nongnu.org,
Anthony Liguori <anthony@codemonkey.ws>,
Luiz Capitulino <lcapitulino@redhat.com>
Subject: Re: [Qemu-devel] [PATCH v2] monitor: work around delayed CHR_EVENT_OPENED events
Date: Thu, 30 May 2013 08:58:44 +0200 [thread overview]
Message-ID: <51A6F8A4.7080706@profihost.ag> (raw)
In-Reply-To: <20130529222946.GH4599@vm>
Am 30.05.2013 00:29, schrieb mdroth:
> On Wed, May 29, 2013 at 01:27:33PM -0400, Luiz Capitulino wrote:
>> On Mon, 27 May 2013 12:59:25 -0500
>> mdroth <mdroth@linux.vnet.ibm.com> wrote:
>>
>>> On Mon, May 27, 2013 at 11:16:01AM -0500, Anthony Liguori wrote:
>>>> Luiz Capitulino <lcapitulino@redhat.com> writes:
>>>>
>>>>> On Sun, 26 May 2013 10:33:39 -0500
>>>>> Michael Roth <mdroth@linux.vnet.ibm.com> wrote:
>>>>>
>>>>>> In the past, CHR_EVENT_OPENED events were emitted via a pre-expired
>>>>>> QEMUTimer. Due to timers being processing at the tail end of each main
>>>>>> loop iteration, this generally meant that such events would be emitted
>>>>>> within the same main loop iteration, prior any client data being read
>>>>>> by tcp/unix socket server backends.
>>>>>>
>>>>>> With 9f939df955a4152aad69a19a77e0898631bb2c18, this was switched to
>>>>>> a "bottom-half" that is registered via g_idle_add(). This makes it
>>>>>> likely that the event won't be sent until a subsequent iteration, and
>>>>>> also adds the possibility that such events will race with the
>>>>>> processing of client data.
>>>>>>
>>>>>> In cases where we rely on the CHR_EVENT_OPENED event being delivered
>>>>>> prior to any client data being read, this may cause unexpected behavior.
>>>>>>
>>>>>> In the case of a QMP monitor session using a socket backend, the delayed
>>>>>> processing of the CHR_EVENT_OPENED event can lead to a situation where
>>>>>> a previous session, where 'qmp_capabilities' has already processed, can
>>>>>> cause the rejection of 'qmp_capabilities' for a subsequent session,
>>>>>> since we reset capabilities in response to CHR_EVENT_OPENED, which may
>>>>>> not yet have been delivered. This can be reproduced with the following
>>>>>> command, generally within 50 or so iterations:
>>>>>>
>>>>>> mdroth@loki:~$ cat cmds
>>>>>> {'execute':'qmp_capabilities'}
>>>>>> {'execute':'query-cpus'}
>>>>>> mdroth@loki:~$ while true; do if socat stdio unix-connect:/tmp/qmp.sock
>>>>>> <cmds | grep CommandNotFound &>/dev/null; then echo failed; break; else
>>>>>> echo ok; fi; done;
>>>>>> ok
>>>>>> ok
>>>>>> failed
>>>>>> mdroth@loki:~$
>>>>>>
>>>>>> As a work-around, we reset capabilities as part of the CHR_EVENT_CLOSED
>>>>>> hook, which gets called as part of the GIOChannel cb associated with the
>>>>>> client rather than a bottom-half, and is thus guaranteed to be delivered
>>>>>> prior to accepting any subsequent client sessions.
>>>>>>
>>>>>> This does not address the more general problem of whether or not there
>>>>>> are chardev users that rely on CHR_EVENT_OPENED being delivered prior to
>>>>>> client data, and whether or not we should consider moving CHR_EVENT_OPENED
>>>>>> "in-band" with connection establishment as a general solution, but fixes
>>>>>> QMP for the time being.
>>>>>>
>>>>>> Reported-by: Stefan Priebe <s.priebe@profihost.ag>
>>>>>> Cc: qemu-stable@nongnu.org
>>>>>> Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
>>>>>
>>>>> Thanks for debugging this Michael. I'm going to apply your patch to the qmp
>>>>> branch because we can't let this broken (esp. in -stable) but this is papering
>>>>> over the real bug...
>>>>
>>>> Do we really need OPENED to happen in a bottom half? Shouldn't the open
>>>> event handlers register the callback instead if they need it?
>>>
>>> I think that's the more general fix, but wasn't sure if there were
>>> historical reasons why we didn't do this in the first place.
>>>
>>> Looking at it more closely it does seem like we need a general fix
>>> sooner rather than later though, and I don't see any reason why we can't
>>> move BH registration into the actual OPENED hooks as you suggest:
>>>
>>> hw/usb/ccid-card-passthru.c
>>> - currently affected? no
>>> debug hook only
>>> - needs OPENED BH? no
>>>
>>> hw/usb/redirect.c
>>> - currently affected? yes
>>> can_read handler checks for dev->parser != NULL, which may be
>>> true if CLOSED BH has been executed yet. In the past, OPENED
>>> quiesced outstanding CLOSED events prior to us reading client data.
>>> - need OPENED BH? possibly
>>> seems to only be needed by CLOSED events, which can be issued by
>>> USBDevice, so we do teardown/usb_detach in BH. For OPENED, this
>>> may not apply. if we do issue a BH, we'd need to modify can_read
>>> handler to avoid race.
>>>
>>> hw/usb/dev-serial.c
>>> - currently affected? no
>>> can_read checks for dev.attached != NULL. set to NULL
>>> synchronously in CLOSED, will not accept reads until OPENED gets
>>> called and sets dev.attached
>>> - need opened BH? no
>>>
>>> hw/char/sclpconsole.c
>>> - currently affected? no
>>> guarded by can_read handler
>>> - need opened BH? no
>>>
>>> hw/char/ipoctal232.c
>>> - currently affected? no
>>> debug hook only
>>> - need opened BH? no
>>>
>>> hw/char/virtio-console.c
>>> - currently affected? no
>>> OPENED/CLOSED map to vq events handled asynchronously. can_read
>>> checks for guest_connected state rather than anything set by OPENED
>>> - need opened BH? no
>>>
>>> qtest.c
>>> - currently affected? yes
>>> can_read handler doesn't check for qtest_opened == true, can read
>>> data before OPENED event is processed
>>> - need opened BH? no
>>>
>>> monitor.c
>>> - current affected? yes
>>> negotiated session state can temporarilly leak into a new session
>>> - need opened BH? no
>>>
>>> gdbstub.c
>>> - currently affected? yes
>>> can fail to pause machine prior to starting gdb session
>>> - need opened BH? no
>>>
>>> So we have a number of cases that can be fixed by avoiding the use of
>>> the generic BH, and only 1 possible cause where we might need a
>>> device-specific BH.
>>>
>>> At first glance anyway. So if this all seems reasonable I can send a
>>> more general fix shortly.
>>
>> Michael, I've dropped your first patch and am taking it that you're
>> going to cook this more general fix.
>
> fix sent:
>
> http://article.gmane.org/gmane.comp.emulators.qemu/213863
thanks seems to work fine
Stefan
prev parent reply other threads:[~2013-05-30 6:59 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2013-05-26 15:33 [Qemu-devel] [PATCH v2] monitor: work around delayed CHR_EVENT_OPENED events Michael Roth
2013-05-27 15:24 ` Luiz Capitulino
2013-05-27 16:16 ` Anthony Liguori
2013-05-27 17:59 ` mdroth
2013-05-29 17:27 ` Luiz Capitulino
2013-05-29 21:03 ` mdroth
2013-05-29 22:29 ` mdroth
2013-05-30 6:58 ` Stefan Priebe - Profihost AG [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=51A6F8A4.7080706@profihost.ag \
--to=s.priebe@profihost.ag \
--cc=anthony@codemonkey.ws \
--cc=lcapitulino@redhat.com \
--cc=mdroth@linux.vnet.ibm.com \
--cc=qemu-devel@nongnu.org \
--cc=qemu-stable@nongnu.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.