From: "Daniel P. Berrangé" <berrange@redhat.com>
To: Greg Kurz <groug@kaod.org>
Cc: Vivek Goyal <vgoyal@redhat.com>,
"Dr. David Alan Gilbert" <dgilbert@redhat.com>,
qemu-devel@nongnu.org, Stefan Hajnoczi <stefanha@redhat.com>,
"Michael S. Tsirkin" <mst@redhat.com>
Subject: Re: [PATCH 2/4] vhost-user: Convert slave channel to QIOChannelSocket
Date: Wed, 10 Mar 2021 11:43:58 +0000
Message-ID: <YEiw/v3vhB7y6ve3@redhat.com>
In-Reply-To: <20210309212322.3ab5809d@bahia.lan>
On Tue, Mar 09, 2021 at 09:23:22PM +0100, Greg Kurz wrote:
> On Tue, 9 Mar 2021 15:17:21 +0000
> Stefan Hajnoczi <stefanha@redhat.com> wrote:
>
> > On Mon, Mar 08, 2021 at 01:31:39PM +0100, Greg Kurz wrote:
> > > + g_autofree int *fd = NULL;
> > > + size_t fdsize = 0;
> > > + int i;
> > >
> > > /* Read header */
> > > iov.iov_base = &hdr;
> > > iov.iov_len = VHOST_USER_HDR_SIZE;
> > >
> > > do {
> > > - size = recvmsg(u->slave_fd, &msgh, 0);
> > > - } while (size < 0 && (errno == EINTR || errno == EAGAIN));
> > > + size = qio_channel_readv_full(ioc, &iov, 1, &fd, &fdsize, NULL);
> > > + } while (size == QIO_CHANNEL_ERR_BLOCK);
> >
> > Is it possible to leak file descriptors and fd[] memory if we receive a
> > short message and then loop around? qio_channel_readv_full() will
> > overwrite &fd and that's how the leak occurs.
> >
>
> qio_channel_readv_full() only returns QIO_CHANNEL_ERR_BLOCK when the
> channel is in non-blocking mode and no data is available. Under the hood,
> this translates to this code in qio_channel_socket_readv():
>
> retry:
> ret = recvmsg(sioc->fd, &msg, sflags);
> if (ret < 0) {
> if (errno == EAGAIN) {
> return QIO_CHANNEL_ERR_BLOCK;
> }
> if (errno == EINTR) {
> goto retry;
> }
>
> error_setg_errno(errp, errno,
> "Unable to read from socket");
> return -1;
> }
>
> This is strictly equivalent to what we currently have. This cannot
> leak file descriptors because we would only loop until the first
> byte of real data is available and ancillary data cannot fly without
> real data, i.e. no file descriptors when recvmsg() returns EAGAIN.
Yep, EAGAIN will only happen if you have no data AND no FDs.
>
> > On the other hand, it looks like ioc is in blocking mode. I'm not sure
> > QIO_CHANNEL_ERR_BLOCK can occur?
> >
>
> You're right but the existing code is ready to handle the non-blocking
> case, so I just kept this behavior.
The existing code is *not* handling the non-blocking case in any
useful way IMHO. It will block execution of this QEMU thread, and
sit in a tight loop burning CPU in the EAGAIN case.
Handling non-blocking means using an I/O watch with the event loop
to be notified when something becomes ready.
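Roughly, an untested sketch (the callback name here is made up,
not from the patch):

    /* Called by the event loop whenever the channel is readable;
     * do one read attempt per wakeup instead of spinning on EAGAIN */
    static gboolean slave_read_ready(QIOChannel *ioc,
                                     GIOCondition condition,
                                     gpointer opaque)
    {
        slave_read(opaque);
        return G_SOURCE_CONTINUE; /* keep the watch installed */
    }

    /* at setup time */
    qio_channel_set_blocking(ioc, false, NULL);
    qio_channel_add_watch(ioc, G_IO_IN, slave_read_ready, dev, NULL);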
I notice the code that follows is also doing
if (size != VHOST_USER_HDR_SIZE) {
error_report("Failed to read from slave.");
goto err;
}
which is also dubious because it assumes the previous recvmsg is
guaranteed to return VHOST_USER_HDR_SIZE bytes of data in a single
call.
It does at least look like the code is intentionally blocking
execution of the QEMU thread until the expected amount of I/O
is received.
Given this, you should remove the while loop entirely and
just call
  qio_channel_readv_full_all()
This will block execution until *all* the requested data bytes
are read. It will retry the recvmsg() in the event of a partial
data read, and will intelligently wait in poll() on EAGAIN.
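i.e. something like this untested sketch, reusing the names from
the patch (the error message is just illustrative):

    struct iovec iov = {
        .iov_base = &hdr,
        .iov_len = VHOST_USER_HDR_SIZE,
    };
    g_autofree int *fd = NULL;
    size_t fdsize = 0;

    /* Blocks until exactly VHOST_USER_HDR_SIZE bytes (plus any
     * ancillary FDs) have been read, retrying partial reads */
    if (qio_channel_readv_full_all(ioc, &iov, 1, &fd, &fdsize, NULL) < 0) {
        error_report("Failed to read from slave.");
        goto err;
    }

and the size != VHOST_USER_HDR_SIZE check can then be dropped,
since the call only succeeds once the full header has arrived.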
>
> > > @@ -1500,8 +1479,8 @@ static void slave_read(void *opaque)
> > >
> > > /* Read payload */
> > > do {
> > > - size = read(u->slave_fd, &payload, hdr.size);
> > > - } while (size < 0 && (errno == EINTR || errno == EAGAIN));
> > > + size = qio_channel_read(ioc, (char *) &payload, hdr.size, NULL);
> > > + } while (size == QIO_CHANNEL_ERR_BLOCK);
> >
> > Same question here.
>
> This also ends up in qio_channel_socket_readv().
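The same suggestion applies to the payload read; an untested
sketch using the _all variant would be

    /* Blocks until all hdr.size payload bytes have been read */
    if (qio_channel_read_all(ioc, (char *) &payload, hdr.size, NULL) < 0) {
        error_report("Failed to read payload from slave.");
        goto err;
    }

which again removes the need for the loop and any short-read
handling.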
Regards,
Daniel
--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|