qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: "Daniel P. Berrangé" <berrange@redhat.com>
To: 858585 jemmy <jemmy858585@gmail.com>
Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>,
	Juan Quintela <quintela@redhat.com>,
	qemu-devel <qemu-devel@nongnu.org>,
	Gal Shachaf <galsha@mellanox.com>,
	Aviad Yehezkel <aviadye@mellanox.com>,
	licq@mellanox.com, adido@mellanox.com,
	Lidong Chen <lidongchen@tencent.com>
Subject: Re: [Qemu-devel] [PATCH v2 4/5] migration: implement bi-directional RDMA QIOChannel
Date: Fri, 27 Apr 2018 10:16:03 +0100	[thread overview]
Message-ID: <20180427091603.GD32172@redhat.com> (raw)
In-Reply-To: <CAOGPPbcPb4F2m_o4jPn2ELRK19=Yr=P+QRi4r_-eVhzmV0d5kA@mail.gmail.com>

On Fri, Apr 27, 2018 at 03:56:38PM +0800, 858585 jemmy wrote:
> On Fri, Apr 27, 2018 at 1:36 AM, Dr. David Alan Gilbert
> <dgilbert@redhat.com> wrote:
> > * Lidong Chen (jemmy858585@gmail.com) wrote:
> >> This patch implements bi-directional RDMA QIOChannel. Because different
> >> threads may access RDMAQIOChannel concurrently, this patch use RCU to protect it.
> >>
> >> Signed-off-by: Lidong Chen <lidongchen@tencent.com>
> >
> > I'm a bit confused by this.
> >
> > I can see it's adding RCU to protect the rdma structures against
> > deletion from multiple threads; that I'm OK with in principal; is that
> > the only locking we need? (I guess the two directions are actually
> > separate RDMAContext's so maybe).
> 
> The qio_channel_rdma_close maybe invoked by migration thread and
> return path thread
> concurrently, so I use a mutex to protect it.

Hmm, that is not good - concurrent threads calling close must not be
allowed to happen even with non-RDMA I/O chanels.

For example, with the QIOChannelSocket, one thread can call close
which sets the fd = -1, another thread can race with this and either
end up calling close again on the same FD or calling close on -1.
Either way the second thread will get an error from close() when
it should have skipped the close() and returned success. Perhaps
migration gets lucky and this doesn't result in it being marked
as failed, but it is still not good.

So only one thread should be calling close().

> If one thread invoke qio_channel_rdma_writev, another thread invokes
> qio_channel_rdma_readv,
> two threads will use separate RDMAContext, so it does not need a lock.
> 
> If two threads invoke qio_channel_rdma_writev concurrently, it will
> need a lock to protect.
> but I find source qemu migration thread only invoke
> qio_channel_rdma_writev, the return path
> thread only invokes qio_channel_rdma_readv.

QIOChannel impls are only intended to cope with a single thread doing
I/O in each direction. If you have two threads needing to read, or
two threads needing to write, the layer above should provide locking
to ensure correct ordering  of I/O oprations.

> The destination qemu only invoked qio_channel_rdma_readv by main
> thread before postcopy and or
> listen thread after postcopy.
> 
> The destination qemu have already protected it by using
> qemu_mutex_lock(&mis->rp_mutex) when writing data to
> source qemu.
> 
> But should we use qemu_mutex_lock to protect qio_channel_rdma_writev
> and qio_channel_rdma_readv?
> to avoid some change in future invoke qio_channel_rdma_writev or
> qio_channel_rdma_readv concurrently?

> 
> >
> > But is there nothing else to make the QIOChannel bidirectional?
> >
> > Also, a lot seems dependent on listen_id, can you explain how that's
> > being used.
> 
> The destination qemu is server side, so listen_id is not zero. the
> source qemu is client side,
> the listen_id is zero.
> I use listen_id to determine whether qemu is destination or source.
> 
> for the destination qemu, if write data to source, it need use the
> return_path rdma, like this:
>     if (rdma->listen_id) {
>         rdma = rdma->return_path;
>     }
> 
> for the source qemu, if read data from destination, it also need use
> the return_path rdma.
>     if (!rdma->listen_id) {
>         rdma = rdma->return_path;
>     }

This feels uncessarily complex to me. Why not just change QIOCHannelRMDA
struct, so that it has 2 RDMA context pointers eg

struct QIOChannelRDMA {
    QIOChannel parent;
    RDMAContext *rdmain;
    RDMAContext *rdmaout;
    QEMUFile *file;
    bool blocking; /* XXX we don't actually honour this yet */
};



Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|

  reply	other threads:[~2018-04-27  9:16 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-04-25 14:35 [Qemu-devel] [PATCH v2 0/5] Enable postcopy RDMA live migration Lidong Chen
2018-04-25 14:35 ` [Qemu-devel] [PATCH v2 1/5] migration: disable RDMA WRITE after postcopy started Lidong Chen
2018-04-26 16:11   ` Dr. David Alan Gilbert
2018-04-25 14:35 ` [Qemu-devel] [PATCH v2 2/5] migration: create a dedicated connection for rdma return path Lidong Chen
2018-04-26 16:19   ` Dr. David Alan Gilbert
2018-04-25 14:35 ` [Qemu-devel] [PATCH v2 3/5] migration: remove unnecessary variables len in QIOChannelRDMA Lidong Chen
2018-04-26 16:40   ` Dr. David Alan Gilbert
2018-04-27  3:51     ` 858585 jemmy
2018-04-27  9:01     ` Daniel P. Berrangé
2018-04-27  9:04   ` Daniel P. Berrangé
2018-04-25 14:35 ` [Qemu-devel] [PATCH v2 4/5] migration: implement bi-directional RDMA QIOChannel Lidong Chen
2018-04-26 17:36   ` Dr. David Alan Gilbert
2018-04-27  7:56     ` 858585 jemmy
2018-04-27  9:16       ` Daniel P. Berrangé [this message]
2018-04-28  4:16         ` 858585 jemmy
2018-04-30  9:18           ` Daniel P. Berrangé
2018-04-25 14:35 ` [Qemu-devel] [PATCH v2 5/5] migration: Stop rdma yielding during incoming postcopy Lidong Chen
2018-04-26 17:54   ` Dr. David Alan Gilbert

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20180427091603.GD32172@redhat.com \
    --to=berrange@redhat.com \
    --cc=adido@mellanox.com \
    --cc=aviadye@mellanox.com \
    --cc=dgilbert@redhat.com \
    --cc=galsha@mellanox.com \
    --cc=jemmy858585@gmail.com \
    --cc=licq@mellanox.com \
    --cc=lidongchen@tencent.com \
    --cc=qemu-devel@nongnu.org \
    --cc=quintela@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).