From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
To: 858585 jemmy <jemmy858585@gmail.com>
Cc: zhang.zhanghailiang@huawei.com,
Juan Quintela <quintela@redhat.com>,
"Daniel P. Berrange" <berrange@redhat.com>,
Aviad Yehezkel <aviadye@mellanox.com>,
Paolo Bonzini <pbonzini@redhat.com>,
qemu-devel <qemu-devel@nongnu.org>,
Adi Dotan <adido@mellanox.com>,
Lidong Chen <lidongchen@tencent.com>
Subject: Re: [Qemu-devel] [PATCH v4 04/12] migration: avoid concurrent invoke channel_close by different threads
Date: Thu, 31 May 2018 11:52:44 +0100
Message-ID: <20180531105243.GB2755@work-vm>
In-Reply-To: <CAOGPPbf9kZOPdotsX7ynUE8XfeVq=vgPJX0AbVUAB4et4T1KKQ@mail.gmail.com>

* 858585 jemmy (jemmy858585@gmail.com) wrote:
> On Wed, May 30, 2018 at 10:45 PM, Dr. David Alan Gilbert
> <dgilbert@redhat.com> wrote:
> > * Lidong Chen (jemmy858585@gmail.com) wrote:
> >> From: Lidong Chen <jemmy858585@gmail.com>
> >>
> >> channel_close may be invoked by different threads. For example, the source
> >> qemu invokes qemu_fclose in the main thread, the migration thread and the
> >> return path thread. The destination qemu invokes qemu_fclose in the main
> >> thread, the listen thread and the COLO incoming thread.
> >>
> >> Add a mutex to the QEMUFile struct to avoid concurrent invocation of
> >> channel_close.
> >>
> >> Signed-off-by: Lidong Chen <lidongchen@tencent.com>
> >> ---
> >> migration/qemu-file.c | 5 +++++
> >> 1 file changed, 5 insertions(+)
> >>
> >> diff --git a/migration/qemu-file.c b/migration/qemu-file.c
> >> index 977b9ae..87d0f05 100644
> >> --- a/migration/qemu-file.c
> >> +++ b/migration/qemu-file.c
> >> @@ -52,6 +52,7 @@ struct QEMUFile {
> >>      unsigned int iovcnt;
> >>
> >>      int last_error;
> >> +    QemuMutex lock;
> >
> > That could do with a comment saying what you're protecting
> >
> >> };
> >>
> >> /*
> >> @@ -96,6 +97,7 @@ QEMUFile *qemu_fopen_ops(void *opaque, const QEMUFileOps *ops)
> >>
> >>      f = g_new0(QEMUFile, 1);
> >>
> >> +    qemu_mutex_init(&f->lock);
> >>      f->opaque = opaque;
> >>      f->ops = ops;
> >>      return f;
> >> @@ -328,7 +330,9 @@ int qemu_fclose(QEMUFile *f)
> >>      ret = qemu_file_get_error(f);
> >>
> >>      if (f->ops->close) {
> >> +        qemu_mutex_lock(&f->lock);
> >>          int ret2 = f->ops->close(f->opaque);
> >> +        qemu_mutex_unlock(&f->lock);
> >
> > OK, and at least for the RDMA code, if it calls
> > close a 2nd time, rioc->rdma is checked so it won't actually free stuff a
> > 2nd time.
> >
> >>          if (ret >= 0) {
> >>              ret = ret2;
> >>          }
> >> @@ -339,6 +343,7 @@ int qemu_fclose(QEMUFile *f)
> >>      if (f->last_error) {
> >>          ret = f->last_error;
> >>      }
> >> +    qemu_mutex_destroy(&f->lock);
> >>      g_free(f);
> >
> > Hmm, but that's not safe; if two things really do call qemu_fclose()
> > on the same structure they race here and can end up destroying the lock
> > twice, or touching f->lock after the 1st one has already done g_free(f).
> >
> >
> > So let's go back a step.
> > I think:
> > a) There should always be a separate QEMUFile* for
> > to_src_file and from_src_file - I don't see where you open
> > the 2nd one; I don't see your implementation of
> > f->ops->get_return_path.
>
> Yes, the current qemu version uses a separate QEMUFile* for to_src_file and
> from_src_file, and the two QEMUFiles point to one QIOChannelRDMA.
>
> f->ops->get_return_path is implemented by channel_output_ops or
> channel_input_ops.

Ah OK, yes that makes sense.

> > b) I *think* that while the different threads might all call
> > fclose(), there should only ever be one qemu_fclose
> > call for each direction on the QEMUFile.
> >
> > But now we have two problems:
> > If (a) is true then f->lock is separate on each one so
> > doesn't really protect if the two directions are closed
> > at once. (Assuming (b) is true)
>
> Yes, you are right. So I should add a QemuMutex to the QIOChannel structure,
> not the QEMUFile structure, and destroy it with qemu_mutex_destroy in
> qio_channel_finalize.

OK, that sounds better.

Dave

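As a rough illustration of the direction agreed above (lock in the shared channel, destroyed only when the channel is finalized), here is a minimal sketch using plain pthreads rather than the QEMU APIs. Channel, FileHandle, channel_close, file_close and the refcounting helpers are all hypothetical stand-ins invented for this example; they are not the QIOChannel or QEMUFile code, and the real patch may well look different. The sketch assumes point (b) holds, i.e. each handle is closed exactly once.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for the shared channel (one QIOChannelRDMA). */
typedef struct Channel {
    pthread_mutex_t lock; /* serialises close; lives as long as the channel */
    int refcount;         /* protected by lock */
    bool closed;          /* protected by lock */
    int close_count;      /* times resources were actually released */
} Channel;

/* Hypothetical stand-in for a QEMUFile: one per direction, sharing ch. */
typedef struct FileHandle {
    Channel *ch;
} FileHandle;

Channel *channel_new(void)
{
    Channel *ch = calloc(1, sizeof(*ch));
    if (!ch) {
        abort();
    }
    pthread_mutex_init(&ch->lock, NULL);
    ch->refcount = 1; /* reference held by the creator */
    return ch;
}

void channel_ref(Channel *ch)
{
    pthread_mutex_lock(&ch->lock);
    ch->refcount++;
    pthread_mutex_unlock(&ch->lock);
}

/* Drops one reference. The mutex is destroyed only here, when the last
 * reference goes away. Returns true if the channel was freed. */
bool channel_unref(Channel *ch)
{
    pthread_mutex_lock(&ch->lock);
    assert(ch->refcount > 0);
    bool last = (--ch->refcount == 0);
    pthread_mutex_unlock(&ch->lock);
    if (last) {
        pthread_mutex_destroy(&ch->lock);
        free(ch);
    }
    return last;
}

/* Safe to call from both directions: only the first call releases
 * resources, later calls see closed == true and do nothing. */
void channel_close(Channel *ch)
{
    pthread_mutex_lock(&ch->lock);
    if (!ch->closed) {
        ch->closed = true;
        ch->close_count++; /* real code would release resources here */
    }
    pthread_mutex_unlock(&ch->lock);
}

FileHandle *file_open(Channel *ch)
{
    FileHandle *f = calloc(1, sizeof(*f));
    if (!f) {
        abort();
    }
    channel_ref(ch);
    f->ch = ch;
    return f;
}

/* Must be called exactly once per handle. The handle owns no lock, so
 * freeing it cannot race with a lock destroy. Returns true if this call
 * also freed the channel. */
bool file_close(FileHandle *f)
{
    Channel *ch = f->ch;
    channel_close(ch);
    free(f);
    return channel_unref(ch);
}
```

In this sketch, two handles opened on the same channel can be closed from different threads: the second channel_close is a no-op, and the mutex outlives both handles because it belongs to the refcounted channel, not to either handle.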
> Thank you.
>
> >
> > If (a) is false and we actually share a single QEMUFile then
> > that race at the end happens.
> >
> > Dave
> >
> >
> >>      trace_qemu_file_fclose();
> >>      return ret;
> >> --
> >> 1.8.3.1
> >>
> > --
> > Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK