From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([208.118.235.92]:37179) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1UMz7j-0001a9-4F for qemu-devel@nongnu.org; Tue, 02 Apr 2013 07:11:32 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1UMz7g-0006ds-Kw for qemu-devel@nongnu.org; Tue, 02 Apr 2013 07:11:31 -0400 Received: from mail-qa0-f47.google.com ([209.85.216.47]:56740) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1UMz7g-0006do-HX for qemu-devel@nongnu.org; Tue, 02 Apr 2013 07:11:28 -0400 Received: by mail-qa0-f47.google.com with SMTP id hu16so1358103qab.13 for ; Tue, 02 Apr 2013 04:11:27 -0700 (PDT) Sender: Paolo Bonzini Message-ID: <515ABCD1.2070008@redhat.com> Date: Tue, 02 Apr 2013 13:11:13 +0200 From: Paolo Bonzini MIME-Version: 1.0 References: <1364893452-10604-1-git-send-email-peter.crosthwaite@xilinx.com> In-Reply-To: <1364893452-10604-1-git-send-email-peter.crosthwaite@xilinx.com> Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Subject: Re: [Qemu-devel] [RFC PATCH] main-loop: Unconditionally unlock iothread List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Peter Crosthwaite Cc: aliguori@us.ibm.com, qemu-devel@nongnu.org, aurelien@aurel32.net, gson@gson.org Il 02/04/2013 11:04, Peter Crosthwaite ha scritto: > Public bug: 1154328 > Broken Commit: a29753f8aa79a34a324afebe340182a51a5aef11 > > ATM, the timeout from g_pollfds_fill is inhibiting unlocking of the > iothread. This is capable of causing a total deadlock when hw/serial > is used as a device. The bug manifests when you go -nographic -serial > mon:stdio and then paste 40 or more chars into the terminal. > > My knowledge of this g_foo is vague at best, but my best working > theory is this: > > - First 8 chars are recieved by the serial device no complaints. > - The next 32 chars, serial returns false for can_receive() so they > are buffered by the MuxDriver object - mux_chr_read() > - Buffer is full, so 41st char causes false return from Muxes own > can_read() > - This propagates all the way up to glib_pollfds_fill and manifests > as a timeout I suppose you mean "manifests as timeout==0". The question is *which* GSource has a timeout of zero? Not the mux's: if mux_chr_can_read() returns zero, the prepare function returns FALSE without touching the timeout at all... static gboolean io_watch_poll_prepare(GSource *source, gint *timeout_) { IOWatchPoll *iwp = io_watch_poll_from_source(source); iwp->max_size = iwp->fd_can_read(iwp->opaque); if (iwp->max_size == 0) { return FALSE; } return g_io_watch_funcs.prepare(source, timeout_); } > - Timeout means no unlock of IOthread. Device land never sees any more > cycles so the serial port never progresses - no flushing of > buffer Still, this is plausible, so the patch looks correct. Paolo > - Deadlock > > Tested on petalogix_ml605 microblazeel machine model, which was faulty > due to 1154328. > > Fix by removing the conditions on unlocking the iothread. Don't know > what else this will break but the timeout is certainly the wrong > condition for the unlock. Probably the real solution is to have a more > selective unlock policy. > > I'm happy for someone to take this patch off my hands, or educate me on > the correct implementation. For the peeps doing automated testing on > nographic platforms this will get your build working again. > > Signed-off-by: Peter Crosthwaite > --- > main-loop.c | 8 ++------ > 1 files changed, 2 insertions(+), 6 deletions(-) > > diff --git a/main-loop.c b/main-loop.c > index eb80ff3..a376898 100644 > --- a/main-loop.c > +++ b/main-loop.c > @@ -194,15 +194,11 @@ static int os_host_main_loop_wait(uint32_t timeout) > > glib_pollfds_fill(&timeout); > > - if (timeout > 0) { > - qemu_mutex_unlock_iothread(); > - } > + qemu_mutex_unlock_iothread(); > > ret = g_poll((GPollFD *)gpollfds->data, gpollfds->len, timeout); > > - if (timeout > 0) { > - qemu_mutex_lock_iothread(); > - } > + qemu_mutex_lock_iothread(); > > glib_pollfds_poll(); > return ret; >