qemu-devel.nongnu.org archive mirror
* [Qemu-devel] Possible bug in monitor code
@ 2014-01-22 15:53 Stratos Psomadakis
  2014-01-23  3:07 ` Fam Zheng
  2014-01-24 23:48 ` Luiz Capitulino
  0 siblings, 2 replies; 15+ messages in thread
From: Stratos Psomadakis @ 2014-01-22 15:53 UTC (permalink / raw)
  To: qemu-devel
  Cc: Synnefo Development, Stratos Psomadakis, Ganeti Development,
	Dimitris Aragiorgis



Hi,

we've encountered a weird issue regarding monitor (qmp and hmp) behavior
with qemu-1.7 (and qemu-1.5). The following steps will reproduce the issue:

    1) Client A connects to qmp socket with socat
    2) Client A gets greeting message {"QMP": {"version": ..}
    3) Client A waits (select on the socket's fd)
    4) Client B tries to connect to the *same* qmp socket with socat
    5) Client B does *NOT* get any greeting message
    6) Client B waits (select on the socket's fd)
    7) Client B closes connection (kill socat)
    8) Client A quits too
    9) Client C connects to qmp socket
    10) Client C gets *two* greeting messages!!!

After some investigation, we traced it down to the monitor_flush()
function in monitor.c. Specifically, when a second client connects to
the qmp (client B), while another one is already using it (client A), we
get the following from stracing the second client (client B):

    connect(3, {sa_family=AF_FILE, path="foo.mon"}, 9) = 0
    getsockname(3, {sa_family=AF_FILE, NULL}, [2]) = 0
    select(4, [0 3], [1 3], [], NULL)       = 2 (out [1 3])
    select(4, [0 3], [], [], NULL

So, the connect() syscall from client B succeeds, although client B's
connection has not yet been accepted by the qmp server (it's still in
the backlog of the qmp listening socket).
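
The backlog behavior can be checked outside of qemu. What follows is only a minimal sketch, assuming Linux and a scratch socket path; the function name connect_before_accept() and the path are hypothetical and have nothing to do with the monitor code. The listener never calls accept(), yet connect() is still expected to return 0 as long as the backlog has room:

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/*
 * Hypothetical helper, not part of QEMU.
 * Returns the result of connect() against a listening AF_UNIX socket
 * on which accept() is never called, or -1 if the setup fails.
 */
int connect_before_accept(const char *path)
{
    struct sockaddr_un addr;
    int lfd, cfd, rc = -1;

    if (strlen(path) >= sizeof(addr.sun_path)) {
        return -1;
    }
    memset(&addr, 0, sizeof(addr));
    addr.sun_family = AF_UNIX;
    strcpy(addr.sun_path, path);

    unlink(path); /* remove a stale socket file, if any */

    lfd = socket(AF_UNIX, SOCK_STREAM, 0);
    if (lfd < 0) {
        return -1;
    }
    if (bind(lfd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        listen(lfd, 1) < 0) {
        close(lfd);
        unlink(path);
        return -1;
    }

    cfd = socket(AF_UNIX, SOCK_STREAM, 0);
    if (cfd >= 0) {
        /* The listening side deliberately never calls accept(). */
        rc = connect(cfd, (struct sockaddr *)&addr, sizeof(addr));
        close(cfd);
    }

    close(lfd);
    unlink(path);
    return rc;
}
```

Calling connect_before_accept("/tmp/backlog-demo.sock") from a small main() should give 0 on Linux, which matches the connect() line in the strace output above.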

After killing client B and then client A, we see the following when
stracing the qemu proc:

    22363 accept4(6, {sa_family=AF_FILE, NULL}, [2], SOCK_CLOEXEC) = 9
    22363 fcntl(9, F_GETFL)                 = 0x2 (flags O_RDWR)
    22363 fcntl(9, F_SETFL, O_RDWR|O_NONBLOCK) = 0
    22363 fstat(9, {st_mode=S_IFSOCK|0777, st_size=0, ...}) = 0
    22363 fcntl(9, F_GETFL)                 = 0x802 (flags O_RDWR|O_NONBLOCK)
    22363 write(9, "{\"QMP\": {\"version\": {\"qemu\": {\"m"..., 127) = -1 EPIPE (Broken pipe)
    22363 --- SIGPIPE (Broken pipe) @ 0 (0) ---

The qmp server / qemu accepts the connection from client B (who has now
closed the connection) and tries to write the greeting message to the
socket fd. This results in write returning an error (EPIPE).
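
The EPIPE itself is easy to see in isolation. This is a rough sketch, assuming Linux; errno_after_peer_close() is a made-up name and the greeting string is a placeholder, not the real QMP greeting. Unlike the write() in the trace, it uses send() with MSG_NOSIGNAL so that the error shows up in errno instead of as a SIGPIPE:

```c
#include <errno.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper, not part of QEMU.
 * Sends to a stream socket whose peer has already closed and returns the
 * resulting errno (0 if the send unexpectedly succeeded, -1 on setup failure).
 */
int errno_after_peer_close(void)
{
    /* Placeholder payload; the real greeting is longer. */
    const char greeting[] = "{\"QMP\": {}}\r\n";
    int sv[2];
    ssize_t rc;
    int err;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0) {
        return -1;
    }

    close(sv[1]); /* the peer goes away, like client B being killed */

    /* MSG_NOSIGNAL suppresses SIGPIPE, so the failure is reported via errno. */
    rc = send(sv[0], greeting, sizeof(greeting) - 1, MSG_NOSIGNAL);
    err = (rc < 0) ? errno : 0;

    close(sv[0]);
    return err;
}
```

On Linux the return value is expected to be EPIPE, the same error qemu gets when it greets the already-closed connection of client B.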

The monitor_flush() function doesn't seem to handle this case (write
error). Instead, it adds a watch / handler to retry the write operation.
Thus, mon->outbuf is not cleaned up properly, which results in duplicate
greeting messages for the next client to connect.
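
To illustrate why the stale buffer leads to two greetings, here is a toy model of the behavior described above. It is only a sketch under stated assumptions: struct toy_monitor, toy_queue(), toy_write() and toy_flush() are all hypothetical stand-ins, not QEMU's Monitor, QString or chardev code, and the watch / retry logic is left out entirely. The drop_on_error flag switches between keeping the buffer on a write error and dropping it:

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define GREETING "{\"QMP\": {}}\n" /* placeholder, not the real greeting */

/* Toy stand-in for the monitor's output buffer; not QEMU's Monitor. */
struct toy_monitor {
    char outbuf[256];
};

static void toy_queue(struct toy_monitor *mon, const char *msg)
{
    strncat(mon->outbuf, msg, sizeof(mon->outbuf) - strlen(mon->outbuf) - 1);
}

/* Hypothetical transport: fails with EPIPE if the peer is gone. */
static long toy_write(bool peer_alive, const char *buf, size_t len,
                      char *sink, size_t sink_size)
{
    if (!peer_alive) {
        errno = EPIPE;
        return -1;
    }
    if (len >= sink_size) {
        len = sink_size - 1;
    }
    memcpy(sink, buf, len);
    sink[len] = '\0';
    return (long)len;
}

static void toy_flush(struct toy_monitor *mon, bool peer_alive,
                      bool drop_on_error, char *sink, size_t sink_size)
{
    size_t len = strlen(mon->outbuf);
    long rc;

    if (len == 0) {
        return;
    }
    rc = toy_write(peer_alive, mon->outbuf, len, sink, sink_size);
    if (rc == (long)len || (drop_on_error && rc < 0 && errno != EAGAIN)) {
        mon->outbuf[0] = '\0'; /* all flushed, or error and we give up */
    }
    /* Otherwise the data stays queued, as if waiting for a retry. */
}

/* Counts the greetings the next client sees after a dead one was greeted. */
int greetings_seen_by_next_client(bool drop_on_error)
{
    struct toy_monitor mon = { "" };
    char sink[256] = "";
    const char *p;
    int count = 0;

    toy_queue(&mon, GREETING); /* client B, already gone */
    toy_flush(&mon, false, drop_on_error, sink, sizeof(sink));

    toy_queue(&mon, GREETING); /* client C */
    toy_flush(&mon, true, drop_on_error, sink, sizeof(sink));

    for (p = sink; (p = strstr(p, GREETING)) != NULL; p += strlen(GREETING)) {
        count++;
    }
    return count;
}
```

In this model greetings_seen_by_next_client(false) gives 2 and greetings_seen_by_next_client(true) gives 1, which mirrors steps 9) and 10) of the reproducer. Whether the real code path behaves exactly like this still needs confirmation.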

The following seems to do the trick.

diff --git a/monitor.c b/monitor.c
index 845f608..5622f20 100644
--- a/monitor.c
+++ b/monitor.c
@@ -288,8 +288,8 @@ void monitor_flush(Monitor *mon)
 
     if (len && !mon->mux_out) {
         rc = qemu_chr_fe_write(mon->chr, (const uint8_t *) buf, len);
-        if (rc == len) {
-            /* all flushed */
+        if ((rc < 0 && errno != EAGAIN) || (rc == len)) {
+            /* all flushed or error */
             QDECREF(mon->outbuf);
             mon->outbuf = qstring_new();
             return;

Comments?

Thanks,
Stratos

-- 
Stratos Psomadakis
<psomas@grnet.gr>


* Re: [Qemu-devel] Possible bug in monitor code
@ 2014-01-23 11:23 Fam Zheng
  2014-01-23 13:44 ` Luiz Capitulino
  0 siblings, 1 reply; 15+ messages in thread
From: Fam Zheng @ 2014-01-23 11:23 UTC (permalink / raw)
  To: Stratos Psomadakis
  Cc: Synnefo Development, Ganeti Development, qemu-devel,
	Dimitris Aragiorgis, Luiz Capitulino


On Thu, 01/23 12:17, Stratos Psomadakis wrote:
> On 01/23/2014 05:07 AM, Fam Zheng wrote:
> > On Wed, 01/22 17:53, Stratos Psomadakis wrote:
> >> Hi,
> >>
> >> we've encountered a weird issue regarding monitor (qmp and hmp) behavior
> >> with qemu-1.7 (and qemu-1.5). The following steps will reproduce the issue:
> >>
> >>     1) Client A connects to qmp socket with socat
> >>     2) Client A gets greeting message {"QMP": {"version": ..}
> >>     3) Client A waits (select on the socket's fd)
> >>     4) Client B tries to connect to the *same* qmp socket with socat
> >>     5) Client B does *NOT* get any greeting message
> >>     6) Client B waits (select on the socket's fd)
> >>     7) Client B closes connection (kill socat)
> >>     8) Client A quits too
> >>     9) Client C connects to qmp socket
> >>     10) Client C gets *two* greeting messages!!!
> > Hi Stratos, thank you for debugging and reporting this.
> >
> > I tested this sequence but can't fully reproduce this. What I see is 5) but no
> > 10). Client C acts normally. And your patch below doesn't solve it for me.
> 
> Hm, which qemu version (or repo branch / tag) did you use? We did a
> quick scan of the master branch code / commits, but we didn't find
> anything that might fix the issue.

I tried on qemu.git master, and also 1.7. I think it is a bug: in my test, at
step 5), B does not get any greeting message.

But I get only one greeting message in step 10), which is a bit different from
what you reported.

And I see no difference with your patch applied.

Cc'ing Luiz who maintains the monitor code and may have more ideas.

Thanks,

Fam

> 
> > To submit a patch, please follow instructions as described in
> > http://wiki.qemu.org/Contribute/SubmitAPatch
> > so it could be picked up by maintainers. Specifically, you need to format your
> > patch email with "git format-patch" and add a "Signed-off-by:" line in your
> > patch email.
> 
> Ok. If any dev can confirm that this is a bug (and that the patch below
> is the correct way to fix it) I'll resubmit it properly.
> 
> Thanks,
> Stratos
> 
> > Thanks,
> >
> > Fam
> >

Thread overview: 15+ messages
2014-01-22 15:53 [Qemu-devel] Possible bug in monitor code Stratos Psomadakis
2014-01-23  3:07 ` Fam Zheng
2014-01-23 10:17   ` Stratos Psomadakis
2014-01-24 23:48 ` Luiz Capitulino
2014-01-27 10:30   ` [Qemu-devel] [PATCH] monitor: Cleanup mon->outbuf on write error Stratos Psomadakis
2014-01-29 10:46     ` Stratos Psomadakis
2014-01-29 14:12       ` Luiz Capitulino
  -- strict thread matches above, loose matches on Subject: below --
2014-01-23 11:23 [Qemu-devel] Possible bug in monitor code Fam Zheng
2014-01-23 13:44 ` Luiz Capitulino
2014-01-23 13:54   ` Luiz Capitulino
2014-01-23 15:33     ` Stratos Psomadakis
2014-01-23 18:28       ` Luiz Capitulino
2014-01-24 10:14         ` Stratos Psomadakis
2014-01-24 10:52           ` Apollon Oikonomopoulos
2014-01-24 14:14           ` Luiz Capitulino
