From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([208.118.235.92]:35576) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1UJ9g7-0006kR-EB for qemu-devel@nongnu.org; Fri, 22 Mar 2013 17:39:13 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1UJ9g6-0006ZS-4y for qemu-devel@nongnu.org; Fri, 22 Mar 2013 17:39:11 -0400
Received: from mx1.redhat.com ([209.132.183.28]:11926) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1UJ9g5-0006ZM-S2 for qemu-devel@nongnu.org; Fri, 22 Mar 2013 17:39:10 -0400
Date: Fri, 22 Mar 2013 17:39:04 -0400
From: Luiz Capitulino
Message-ID: <20130322173904.66d2f5ce@doriath>
In-Reply-To: <20130322165039.32aae1fb@doriath>
References: <514C21C6.3070800@greensocs.com> <20130322165039.32aae1fb@doriath>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable
Subject: Re: [Qemu-devel] Abort in monitor_puts.
To: Luiz Capitulino
Cc: Anthony Liguori, kraxel@redhat.com, qemu-devel, KONRAD Frédéric

On Fri, 22 Mar 2013 16:50:39 -0400
Luiz Capitulino wrote:

> On Fri, 22 Mar 2013 10:17:58 +0100
> KONRAD Frédéric wrote:
>
> > Hi,
> >
> > Seems there is an issue with the current git (found by toddf on IRC).
> >
> > To reproduce:
> >
> > ./qemu-system-x86_64 --monitor stdio --nographic
> >
> > and put "?" it should abort.
> >
> > Here is the backtrace:
> >
> > #0 0x00007f77cd347935 in raise () from /lib64/libc.so.6
> > #1 0x00007f77cd3490e8 in abort () from /lib64/libc.so.6
> > #2 0x00007f77cd3406a2 in __assert_fail_base () from /lib64/libc.so.6
> > #3 0x00007f77cd340752 in __assert_fail () from /lib64/libc.so.6
> > #4 0x00007f77d1c1f226 in monitor_puts (mon=,
> > str=) at
>
> Yes, it's easy to reproduce.
> Bisect says:
>
> f628926bb423fa8a7e0b114511400ea9df38b76a is the first bad commit
> commit f628926bb423fa8a7e0b114511400ea9df38b76a
> Author: Gerd Hoffmann
> Date: Tue Mar 19 10:57:56 2013 +0100
>
>     fix monitor
>
>     chardev flow control broke monitor, fix it by adding watch support.
>
>     Signed-off-by: Anthony Liguori
>
> My impression is that monitor_puts() is being called in parallel.

Not at all. What's happening is that qemu_chr_fe_write() returns < 0, so
mon->outbuf_index is never reset and the buffer stays full. That is what
trips the assert in monitor_puts().

The previous version of monitor_flush() ignored errors and everything
worked, so doing the same thing here fixes the problem :)

For some reason I'm unable to see what the error code is.

Gerd, do you think the patch below is reasonable? If it's not, how should
we handle errors here?

diff --git a/monitor.c b/monitor.c
index cfb5d64..ecfe97c 100644
--- a/monitor.c
+++ b/monitor.c
@@ -274,12 +274,11 @@ void monitor_flush(Monitor *mon)
 
     if (mon && mon->outbuf_index != 0 && !mon->mux_out) {
         rc = qemu_chr_fe_write(mon->chr, mon->outbuf, mon->outbuf_index);
-        if (rc == mon->outbuf_index) {
+        if (rc == mon->outbuf_index || rc < 0) {
             /* all flushed */
             mon->outbuf_index = 0;
             return;
-        }
-        if (rc > 0) {
+        } else {
             /* partinal write */
             memmove(mon->outbuf, mon->outbuf + rc, mon->outbuf_index - rc);
             mon->outbuf_index -= rc;
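
For anyone who wants to see the failure mode in isolation, here is a rough
standalone sketch. It is not QEMU code: ToyMonitor, OUTBUF_SIZE, the toy_*
functions and the fake write callbacks are all made up for illustration, and
toy_puts() returns -1 at the point where I'm assuming the real monitor_puts()
would hit its assert (buffer still full on entry). It only models the
buffer-index logic described above, with a write backend that always fails.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for the monitor output buffer (not QEMU's struct). */
enum { OUTBUF_SIZE = 16 };

typedef struct {
    char outbuf[OUTBUF_SIZE];
    int outbuf_index;
} ToyMonitor;

/* Hypothetical backend write: returns bytes written, or < 0 on error. */
typedef int (*toy_write_fn)(const char *buf, int len);
typedef void (*toy_flush_fn)(ToyMonitor *mon, toy_write_fn write_fn);

int toy_write_fail(const char *buf, int len) { (void)buf; (void)len; return -1; }
int toy_write_partial(const char *buf, int len) { (void)buf; return len / 2; }

/* Flush without the patch: an error leaves the buffer untouched. */
void toy_flush_unpatched(ToyMonitor *mon, toy_write_fn write_fn)
{
    int rc;

    if (mon->outbuf_index == 0) {
        return;
    }
    rc = write_fn(mon->outbuf, mon->outbuf_index);
    if (rc == mon->outbuf_index) {
        mon->outbuf_index = 0;          /* all flushed */
        return;
    }
    if (rc > 0) {                       /* partial write: keep the tail */
        memmove(mon->outbuf, mon->outbuf + rc, mon->outbuf_index - rc);
        mon->outbuf_index -= rc;
    }
}

/* Flush with the proposed change: an error drops the buffered data. */
void toy_flush_patched(ToyMonitor *mon, toy_write_fn write_fn)
{
    int rc;

    if (mon->outbuf_index == 0) {
        return;
    }
    rc = write_fn(mon->outbuf, mon->outbuf_index);
    if (rc == mon->outbuf_index || rc < 0) {
        mon->outbuf_index = 0;          /* all flushed, or error ignored */
    } else {                            /* partial write: keep the tail */
        memmove(mon->outbuf, mon->outbuf + rc, mon->outbuf_index - rc);
        mon->outbuf_index -= rc;
    }
}

/*
 * Append str to the buffer, flushing whenever it fills up.
 * Returns 0 on success, or -1 if the buffer is still full when the next
 * character is about to be stored (assumed to be where the assert fires).
 */
int toy_puts(ToyMonitor *mon, const char *str,
             toy_flush_fn flush, toy_write_fn write_fn)
{
    for (;;) {
        if (mon->outbuf_index >= OUTBUF_SIZE - 1) {
            return -1;
        }
        char c = *str++;
        if (c == '\0') {
            return 0;
        }
        mon->outbuf[mon->outbuf_index++] = c;
        if (mon->outbuf_index >= OUTBUF_SIZE - 1) {
            flush(mon, write_fn);
        }
    }
}
```

With toy_write_fail, a string longer than the buffer makes toy_puts() report
the overflow when paired with toy_flush_unpatched, and succeed when paired
with toy_flush_patched. The price is that the buffered output is silently
thrown away on error, which is the open question in the mail above.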