From: "Daniel P. Berrangé" <berrange@redhat.com>
To: "manish.mishra" <manish.mishra@nutanix.com>
Cc: qemu-devel@nongnu.org, peterx@redhat.com,
	prerna.saxena@nutanix.com, quintela@redhat.com,
	dgilbert@redhat.com, lsoaresp@redhat.com
Subject: Re: [PATCH v3 1/2] io: Add support for MSG_PEEK for socket channel
Date: Tue, 22 Nov 2022 10:31:59 +0000
Message-ID: <Y3ylH0J7rl5o5KrI@redhat.com>
In-Reply-To: <f03d5744-8369-ed73-49f8-9a53a9507afb@nutanix.com>

On Tue, Nov 22, 2022 at 03:43:55PM +0530, manish.mishra wrote:
> 
> On 22/11/22 3:23 pm, Daniel P. Berrangé wrote:
> > On Tue, Nov 22, 2022 at 03:10:53PM +0530, manish.mishra wrote:
> > > On 22/11/22 2:59 pm, Daniel P. Berrangé wrote:
> > > > On Tue, Nov 22, 2022 at 02:38:53PM +0530, manish.mishra wrote:
> > > > > On 22/11/22 2:30 pm, Daniel P. Berrangé wrote:
> > > > > > On Sat, Nov 19, 2022 at 09:36:14AM +0000, manish.mishra wrote:
> > > > > > > MSG_PEEK peeks at the head of the channel: the data is treated
> > > > > > > as unread, and the next read will still return it. This support
> > > > > > > is currently added only for the socket class. An extra parameter
> > > > > > > 'flags' is added to the io_readv calls to pass read flags such
> > > > > > > as MSG_PEEK.
> > > > > > > 
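For reference, a minimal sketch of the semantics being added, at the
plain POSIX socket level (the fd and the buffer size are illustrative):

    #include <sys/socket.h>

    char buf[8];
    /* Peek: copy up to 8 bytes out of the receive queue without
     * consuming them. */
    ssize_t n = recv(fd, buf, sizeof(buf), MSG_PEEK);
    /* A later ordinary read returns the same bytes again. */
    ssize_t m = recv(fd, buf, sizeof(buf), 0);
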
> > > > > > > Suggested-by: Daniel P. Berrangé <berrange@redhat.com>
> > > > > > > Signed-off-by: manish.mishra <manish.mishra@nutanix.com>
> > > > > > > ---
> > > > > > >     chardev/char-socket.c               |  4 +-
> > > > > > >     include/io/channel.h                | 83 +++++++++++++++++++++++++++++
> > > > > > >     io/channel-buffer.c                 |  1 +
> > > > > > >     io/channel-command.c                |  1 +
> > > > > > >     io/channel-file.c                   |  1 +
> > > > > > >     io/channel-null.c                   |  1 +
> > > > > > >     io/channel-socket.c                 | 16 +++++-
> > > > > > >     io/channel-tls.c                    |  1 +
> > > > > > >     io/channel-websock.c                |  1 +
> > > > > > >     io/channel.c                        | 73 +++++++++++++++++++++++--
> > > > > > >     migration/channel-block.c           |  1 +
> > > > > > >     scsi/qemu-pr-helper.c               |  2 +-
> > > > > > >     tests/qtest/tpm-emu.c               |  2 +-
> > > > > > >     tests/unit/test-io-channel-socket.c |  1 +
> > > > > > >     util/vhost-user-server.c            |  2 +-
> > > > > > >     15 files changed, 179 insertions(+), 11 deletions(-)
> > > > > > > diff --git a/io/channel-socket.c b/io/channel-socket.c
> > > > > > > index b76dca9cc1..a06b24766d 100644
> > > > > > > --- a/io/channel-socket.c
> > > > > > > +++ b/io/channel-socket.c
> > > > > > > @@ -406,6 +406,8 @@ qio_channel_socket_accept(QIOChannelSocket *ioc,
> > > > > > >         }
> > > > > > >     #endif /* WIN32 */
> > > > > > > +    qio_channel_set_feature(QIO_CHANNEL(cioc), QIO_CHANNEL_FEATURE_READ_MSG_PEEK);
> > > > > > > +
> > > > > > This covers the incoming server-side socket.
> > > > > > 
> > > > > > This also needs to be set on the outgoing client-side socket, in
> > > > > > qio_channel_socket_connect_async.
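Presumably something along these lines, once the outgoing connection has
been established (the exact placement within the connect path is an
assumption):

    /* Hypothetical: mark the client-side channel as supporting
     * MSG_PEEK reads, mirroring the server-side accept path. */
    qio_channel_set_feature(QIO_CHANNEL(ioc),
                            QIO_CHANNEL_FEATURE_READ_MSG_PEEK);
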
> > > > > Yes, sorry, I considered only the current use case, but as it is a generic one, both should be set. Thanks, will update it.
> > > > > 
> > > > > > > @@ -705,7 +718,6 @@ static ssize_t qio_channel_socket_writev(QIOChannel *ioc,
> > > > > > >     }
> > > > > > >     #endif /* WIN32 */
> > > > > > > -
> > > > > > >     #ifdef QEMU_MSG_ZEROCOPY
> > > > > > >     static int qio_channel_socket_flush(QIOChannel *ioc,
> > > > > > >                                         Error **errp)
> > > > > > Please remove this unrelated whitespace change.
> > > > > > 
> > > > > > 
> > > > > > > @@ -109,6 +117,37 @@ int qio_channel_readv_all_eof(QIOChannel *ioc,
> > > > > > >         return qio_channel_readv_full_all_eof(ioc, iov, niov, NULL, NULL, errp);
> > > > > > >     }
> > > > > > > +int qio_channel_readv_peek_all_eof(QIOChannel *ioc,
> > > > > > > +                                   const struct iovec *iov,
> > > > > > > +                                   size_t niov,
> > > > > > > +                                   Error **errp)
> > > > > > > +{
> > > > > > > +   ssize_t len = 0;
> > > > > > > +   ssize_t total = iov_size(iov, niov);
> > > > > > > +
> > > > > > > +   while (len < total) {
> > > > > > > +       len = qio_channel_readv_full(ioc, iov, niov, NULL,
> > > > > > > +                                    NULL, QIO_CHANNEL_READ_FLAG_MSG_PEEK, errp);
> > > > > > > +
> > > > > > > +       if (len == QIO_CHANNEL_ERR_BLOCK) {
> > > > > > > +            if (qemu_in_coroutine()) {
> > > > > > > +                qio_channel_yield(ioc, G_IO_IN);
> > > > > > > +            } else {
> > > > > > > +                qio_channel_wait(ioc, G_IO_IN);
> > > > > > > +            }
> > > > > > > +            continue;
> > > > > > > +       }
> > > > > > > +       if (len == 0) {
> > > > > > > +           return 0;
> > > > > > > +       }
> > > > > > > +       if (len < 0) {
> > > > > > > +           return -1;
> > > > > > > +       }
> > > > > > > +   }
> > > > > > This will busy wait, burning CPU, whenever a peek returns > 0 but < total: the data is still queued, so the next peek returns immediately with the same partial result, and the yield/wait paths are never reached.
> > > > > > 
> > > > > Daniel, I could use MSG_WAITALL too, if that works, but then we
> > > > > lose the opportunity to yield. Or if you have some other idea.
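For illustration, at the plain socket level the MSG_WAITALL variant would
look roughly like this (a sketch only, not wired into the QIOChannel
layer; fd, buf and want are illustrative):

    struct iovec iov = { .iov_base = buf, .iov_len = want };
    struct msghdr msg = { .msg_iov = &iov, .msg_iovlen = 1 };
    /* Blocks until 'want' bytes are queued, but leaves them in the
     * receive queue; while blocked there is no way to yield back to
     * the event loop. */
    ssize_t n = recvmsg(fd, &msg, MSG_PEEK | MSG_WAITALL);
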
> > > > I fear this is an inherent problem with the idea of using PEEK to
> > > > look at the magic data.
> > > > 
> > > > If we actually read the magic bytes off the wire, then we could have
> > > > the same code path for TLS and non-TLS. We would have to modify the
> > > > existing later code paths, though, to take account of the fact that
> > > > the magic was already read by an earlier code path.
> > > > 
> > > > With regards,
> > > > Daniel
> > > 
> > > Sure, Daniel, I am happy to drop the use of MSG_PEEK, but that way we
> > > also have an issue with TLS, for the reason we discussed in v2. Is it
> > > okay to send a patch with an actual read-ahead, but not for the TLS
> > > case? TLS does not have this bug anyway, as it does a handshake.
> > I've re-read the previous threads, but I don't see what the problem
> > with TLS is.  We already decided that TLS is not affected by the
> > race condition. So there should be no problem in reading the magic
> > bytes early on the TLS channels, while reading the bytes early on
> > a non-TLS channel will fix the race condition.
> 
> 
> Actually, with TLS a channel is considered established only once the
> handshake completes, and on the source side we do the initial
> qemu_flush only when all channels are established. But on the
> destination side we get stuck reading the magic on the main channel,
> which never arrives because the source has not flushed the data, so no
> new connections (e.g. multifd) can be accepted. So basically the
> destination cannot accept any new channel until it reads from the main
> channel, and the source does not put any data on the main channel until
> all channels are established. So if we read ahead in
> ioc_process_incoming_channel there is this deadlock with TLS. The issue
> is not there in the non-TLS case, because there the source side assumes
> a connection is established as soon as the connect() call succeeds.
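
Roughly, the ordering that deadlocks (channel names illustrative):

    src: establish main channel (TLS handshake completes)
    src: establish multifd channels        <- waits for dst to accept
    src: flush magic on main channel       <- only after all channels up
    dst: accept main; read magic           <- blocks, nothing flushed yet
    dst: accept multifd                    <- never reached

Each side is waiting on the other, so neither makes progress.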

Ah yes, I forgot about the 'flush' problem. Reading the magic in the
non-TLS case is OK then, I guess.

With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|


