From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([208.118.235.92]:56472) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1Uh05d-0005dz-Td for qemu-devel@nongnu.org; Mon, 27 May 2013 12:16:10 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1Uh05c-0004eQ-6B for qemu-devel@nongnu.org; Mon, 27 May 2013 12:16:05 -0400 Received: from mail-ob0-x230.google.com ([2607:f8b0:4003:c01::230]:60271) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1Uh05c-0004eJ-0H for qemu-devel@nongnu.org; Mon, 27 May 2013 12:16:04 -0400 Received: by mail-ob0-f176.google.com with SMTP id v19so707616obq.21 for ; Mon, 27 May 2013 09:16:03 -0700 (PDT) From: Anthony Liguori In-Reply-To: <20130527112450.12eba408@redhat.com> References: <1369582419-28205-1-git-send-email-mdroth@linux.vnet.ibm.com> <20130527112450.12eba408@redhat.com> Date: Mon, 27 May 2013 11:16:01 -0500 Message-ID: <877gikflfi.fsf@codemonkey.ws> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Subject: Re: [Qemu-devel] [PATCH v2] monitor: work around delayed CHR_EVENT_OPENED events List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Luiz Capitulino , Michael Roth Cc: qemu-stable@nongnu.org, qemu-devel@nongnu.org, s.priebe@profihost.ag Luiz Capitulino writes: > On Sun, 26 May 2013 10:33:39 -0500 > Michael Roth wrote: > >> In the past, CHR_EVENT_OPENED events were emitted via a pre-expired >> QEMUTimer. Due to timers being processing at the tail end of each main >> loop iteration, this generally meant that such events would be emitted >> within the same main loop iteration, prior any client data being read >> by tcp/unix socket server backends. >> >> With 9f939df955a4152aad69a19a77e0898631bb2c18, this was switched to >> a "bottom-half" that is registered via g_idle_add(). This makes it >> likely that the event won't be sent until a subsequent iteration, and >> also adds the possibility that such events will race with the >> processing of client data. >> >> In cases where we rely on the CHR_EVENT_OPENED event being delivered >> prior to any client data being read, this may cause unexpected behavior. >> >> In the case of a QMP monitor session using a socket backend, the delayed >> processing of the CHR_EVENT_OPENED event can lead to a situation where >> a previous session, where 'qmp_capabilities' has already processed, can >> cause the rejection of 'qmp_capabilities' for a subsequent session, >> since we reset capabilities in response to CHR_EVENT_OPENED, which may >> not yet have been delivered. This can be reproduced with the following >> command, generally within 50 or so iterations: >> >> mdroth@loki:~$ cat cmds >> {'execute':'qmp_capabilities'} >> {'execute':'query-cpus'} >> mdroth@loki:~$ while true; do if socat stdio unix-connect:/tmp/qmp.sock >> /dev/null; then echo failed; break; else >> echo ok; fi; done; >> ok >> ok >> failed >> mdroth@loki:~$ >> >> As a work-around, we reset capabilities as part of the CHR_EVENT_CLOSED >> hook, which gets called as part of the GIOChannel cb associated with the >> client rather than a bottom-half, and is thus guaranteed to be delivered >> prior to accepting any subsequent client sessions. >> >> This does not address the more general problem of whether or not there >> are chardev users that rely on CHR_EVENT_OPENED being delivered prior to >> client data, and whether or not we should consider moving CHR_EVENT_OPENED >> "in-band" with connection establishment as a general solution, but fixes >> QMP for the time being. >> >> Reported-by: Stefan Priebe >> Cc: qemu-stable@nongnu.org >> Signed-off-by: Michael Roth > > Thanks for debugging this Michael. I'm going to apply your patch to the qmp > branch because we can't let this broken (esp. in -stable) but this is papering > over the real bug... Do we really need OPENED to happen in a bottom half? Shouldn't the open event handlers register the callback instead if they need it? Regards, Anthony Liguori > >> --- >> v1->v2: >> * remove command_mode reset from CHR_EVENT_OPENED case, since this >> might still cause a race >> >> monitor.c | 2 +- >> 1 file changed, 1 insertion(+), 1 deletion(-) >> >> diff --git a/monitor.c b/monitor.c >> index 6ce2a4e..f1953a0 100644 >> --- a/monitor.c >> +++ b/monitor.c >> @@ -4649,7 +4649,6 @@ static void monitor_control_event(void *opaque, int event) >> >> switch (event) { >> case CHR_EVENT_OPENED: >> - mon->mc->command_mode = 0; >> data = get_qmp_greeting(); >> monitor_json_emitter(mon, data); >> qobject_decref(data); >> @@ -4660,6 +4659,7 @@ static void monitor_control_event(void *opaque, int event) >> json_message_parser_init(&mon->mc->parser, handle_qmp_command); >> mon_refcount--; >> monitor_fdsets_cleanup(); >> + mon->mc->command_mode = 0; >> break; >> } >> }