From: "Daniel P. Berrange" <berrange@redhat.com>
To: "Gaohaifeng (A)" <gaohaifeng.gao@huawei.com>
Cc: "pbonzini@redhat.com" <pbonzini@redhat.com>,
"Lilijun (Jerry)" <jerry.lilijun@huawei.com>,
zangchuanqiang <zangchuanqiang@huawei.com>,
qemu devel <qemu-devel@nongnu.org>
Subject: Re: [Qemu-devel] A question about this commit 9894dc0cdcc397ee5b26370bc53da6d360a363c2
Date: Thu, 25 Aug 2016 11:52:27 -0400 [thread overview]
Message-ID: <20160825155227.GB23814@redhat.com> (raw)
In-Reply-To: <47CEA9C0E570484FBF22EF0D7EBCE5B5357D9022@szxema505-mbx.china.huawei.com>
On Thu, Aug 25, 2016 at 01:21:47PM +0000, Gaohaifeng (A) wrote:
> On Tue, Aug 23, 2016 at 08:57:44AM +0000, Gaohaifeng (A) wrote:
> > Hi Daniel & Paolo,
> > >
> > > Commit 9894dc0c "char: convert from GIOChannel to QIOChannel", about
> > >
> > > the below code segment:
> > >
> > > -static gboolean tcp_chr_read(GIOChannel *chan, GIOCondition cond,
> > > void *opaque)
> > > +static gboolean tcp_chr_read(QIOChannel *chan, GIOCondition cond,
> > > +void *opaque)
> > > {
> > > CharDriverState *chr = opaque;
> > > TCPCharDriver *s = chr->opaque;
> > > @@ -2938,9 +2801,7 @@ static gboolean tcp_chr_read(GIOChannel *chan, GIOCondition cond, void *opaque)
> > > if (len > s->max_size)
> > > len = s->max_size;
> > > size = tcp_chr_recv(chr, (void *)buf, len);
> > > - if (size == 0 ||
> > > - (size < 0 &&
> > > - socket_error() != EAGAIN && socket_error() != EWOULDBLOCK)) {
> > > + if (size == 0 || size == -1) {
> > > /* connection closed */
> > > tcp_chr_disconnect(chr);
> > > } else if (size > 0) {
> > >
> > > The old version will not call tcp_chr_disconnect when error is not
> > > EAGAIN.
> > > The new version will call tcp_chr_disconnect when size==-1. From the
> > > tcp_chr_recv function we see EAGIN will return -1, so EAGIN will call
> > > tcp_chr_disconnect.
> > >
> > > We meet an issue when Guest VM use DPDK(1.6.0) l2fwd, it may exit
> > > since link status is not up. The link is down because
> > > tcp_chr_disconnect will set it.
> > >
> > > Can you explain it why EAGIN here changes the behavior ?
>
> > tcp_chr_read is used in a call to io_add_watch_poll() which sets up an event loop watch so that tcp_chr_read is called when there is incoming data to be read. IOW, when tcp_chr_read is invoked, I would always expect that at least 1 byte is available, and so EAGAIN should not occurr.
> >
> > So I'm puzzled why you would get EAGAIN at all, unless there is some kind of race with other code also consuming bytes from the same socket.
>
> It could easily reproduce this when repeating 'rmmod virtio_net;modprobe virtio_net'.
> The call stack:
> #0 tcp_chr_disconnect (chr=0x7f09d5b2d540) at qemu-char.c:2867
> #1 0x00007f09d43b2106 in tcp_chr_read (chan=0x7f09d5b2df30, cond=G_IO_IN, opaque=0x7f09d5b2d540) at qemu-char.c:2917
> #2 0x00007f09d46364fa in qio_channel_fd_source_dispatch (source=0x7f093e603b30, callback=0x7f09d43b1fcb <tcp_chr_read>, user_data=0x7f09d5b2d540) at io/channel-watch.c:84
> #3 0x00007f09d2e9999a in g_main_context_dispatch () from /lib64/libglib-2.0.so.0
> #4 0x00007f09d45bfc96 in glib_pollfds_poll () at main-loop.c:213
> #5 0x00007f09d45bfd73 in os_host_main_loop_wait (timeout=991853) at main-loop.c:258
> #6 0x00007f09d45bfe23 in main_loop_wait (nonblocking=0) at main-loop.c:506
> #7 0x00007f09d43bbdd2 in main_loop () at vl.c:2119
> #8 0x00007f09d43c380e in main (argc=56, argv=0x7ffe7ad669a8, envp=0x7ffe7ad66b70) at vl.c:4896
> (gdb) f 1
> #1 0x00007f09d43b2106 in tcp_chr_read (chan=0x7f09d5b2df30, cond=G_IO_IN, opaque=0x7f09d5b2d540) at qemu-char.c:2917
> 2917 tcp_chr_disconnect(chr);
> (gdb) p errno
> $1 = 11
>
> EAGAIN = 11
I think what is happening is a race condition - the main event loop thread
sees I/O incoming triggering tcp_chr_read. Meanwhile the vhost-user.c file is
triggering vhost_user_read() in a different thread which consumes the incoming
I/O before tcp_chr_read can handle it.
IOW, you are right, we should have kept the EAGAIN handling code in tcp_chr_read
to avoid the disconnects.
Regards,
Daniel
--
|: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org -o- http://virt-manager.org :|
|: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|
prev parent reply other threads:[~2016-08-25 15:52 UTC|newest]
Thread overview: 4+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-08-23 8:57 [Qemu-devel] A question about this commit 9894dc0cdcc397ee5b26370bc53da6d360a363c2 Gaohaifeng (A)
2016-08-25 1:33 ` Daniel P. Berrange
2016-08-25 13:21 ` Gaohaifeng (A)
2016-08-25 15:52 ` Daniel P. Berrange [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20160825155227.GB23814@redhat.com \
--to=berrange@redhat.com \
--cc=gaohaifeng.gao@huawei.com \
--cc=jerry.lilijun@huawei.com \
--cc=pbonzini@redhat.com \
--cc=qemu-devel@nongnu.org \
--cc=zangchuanqiang@huawei.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).