From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
To: Peter Xu <peterx@redhat.com>
Cc: Peter Maydell <peter.maydell@linaro.org>,
Lukas Straub <lukasstraub2@web.de>,
"Daniel P . Berrange" <berrange@redhat.com>,
Juan Quintela <quintela@redhat.com>,
qemu-devel@nongnu.org,
Leonardo Bras Soares Passos <lsoaresp@redhat.com>,
Eric Blake <eblake@redhat.com>
Subject: Re: [PATCH v3 0/5] migrations: Fix potential rare race of migration-test after yank
Date: Mon, 26 Jul 2021 12:45:36 +0100 [thread overview]
Message-ID: <YP6gYODsPyyQIDyV@work-vm> (raw)
In-Reply-To: <20210722175841.938739-1-peterx@redhat.com>
* Peter Xu (peterx@redhat.com) wrote:
> v3:
> - Use WITH_QEMU_LOCK_GUARD() for patch 2 [Eric]
> (potentially I can also replace other existing uses of qemu_file_lock into
> WITH_QEMU_LOCK_GUARD, but I decided to took Dave's r-b first and leave that
> for later)
> - Added r-bs for Dave on patch 2/4
> - Add a comment in patch 5 to explain why it's safe to unregister yank without
> qemu_file_lock, in postcopy_pause() [Dave]
>
> v2:
> - Pick r-b for Dave on patch 1/3
> - Move migration_file_get_ioc() from patch 5 to patch 4, meanwhile rename it to
> qemu_file_get_ioc(). [Dave]
> - Patch 2 "migration: Shutdown src in await_return_path_close_on_source()" is
> replaced by patch "migration: Make from_dst_file accesses thread-safe" [Dave]
>
> Patch 1 fixes a possible race that migration thread can accidentally skip
> join() of rp_thread even if the return thread is enabled. Patch 1 is suspected
> to also be the root cause of the recent hard-to-reproduce migration-test
> failure here reported by PMM:
>
> https://lore.kernel.org/qemu-devel/YPamXAHwan%2FPPXLf@work-vm/
>
> I didn't reproduce it myself; but after co-debugged with Dave it's suspected
> that the race of rp_thread could be the cause. It's not exposed before because
> yank is soo strict on releasing instances, while we're not that strict before,
> and didn't join() on rp_thread wasn't so dangerous after all when migration
> succeeded before.
>
> Patch 2 fixes another theoretical race on accessing from_dst_file spotted by
> Dave. I don't think there's known issues with it, but may still worth fixing.
>
> Patch 3 should be a cleanup on yank that I think would be nice to have.
>
> Patch 4-5 are further cleanups to remove the ref==1 check in channel_close(),
> finally, as I always thought that's a bit hackish. So I used explicit
> unregister of the yank function at necessary places to replace that ref==1 one.
>
> I still think having patch 3-5 altogether would be great, however I think patch
> 1 should still be the most important to be reviewed. Also it would be great to
> know whether patch 1 could fix the known yank crash.
>
> Please review, thanks.
Queued with the fix; it survived over a long weekend running about 50k
times OK.
Dave
> Peter Xu (5):
> migration: Fix missing join() of rp_thread
> migration: Make from_dst_file accesses thread-safe
> migration: Introduce migration_ioc_[un]register_yank()
> migration: Teach QEMUFile to be QIOChannel-aware
> migration: Move the yank unregister of channel_close out
>
> migration/channel.c | 15 ++-------
> migration/migration.c | 57 +++++++++++++++++++++++++++--------
> migration/migration.h | 15 +++++++--
> migration/multifd.c | 8 ++---
> migration/qemu-file-channel.c | 11 ++-----
> migration/qemu-file.c | 17 ++++++++++-
> migration/qemu-file.h | 4 ++-
> migration/ram.c | 3 +-
> migration/savevm.c | 11 +++++--
> migration/yank_functions.c | 42 ++++++++++++++++++++++++++
> migration/yank_functions.h | 3 ++
> 11 files changed, 138 insertions(+), 48 deletions(-)
>
> --
> 2.31.1
>
>
>
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
prev parent reply other threads:[~2021-07-26 11:47 UTC|newest]
Thread overview: 10+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-07-22 17:58 [PATCH v3 0/5] migrations: Fix potential rare race of migration-test after yank Peter Xu
2021-07-22 17:58 ` [PATCH v3 1/5] migration: Fix missing join() of rp_thread Peter Xu
2021-07-22 17:58 ` [PATCH v3 2/5] migration: Make from_dst_file accesses thread-safe Peter Xu
2021-07-22 19:32 ` Peter Xu
2021-07-22 17:58 ` [PATCH v3 3/5] migration: Introduce migration_ioc_[un]register_yank() Peter Xu
2021-07-22 17:58 ` [PATCH v3 4/5] migration: Teach QEMUFile to be QIOChannel-aware Peter Xu
2021-07-22 17:58 ` [PATCH v3 5/5] migration: Move the yank unregister of channel_close out Peter Xu
2021-07-22 18:41 ` Dr. David Alan Gilbert
2021-07-24 17:21 ` [PATCH v3 0/5] migrations: Fix potential rare race of migration-test after yank Lukas Straub
2021-07-26 11:45 ` Dr. David Alan Gilbert [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=YP6gYODsPyyQIDyV@work-vm \
--to=dgilbert@redhat.com \
--cc=berrange@redhat.com \
--cc=eblake@redhat.com \
--cc=lsoaresp@redhat.com \
--cc=lukasstraub2@web.de \
--cc=peter.maydell@linaro.org \
--cc=peterx@redhat.com \
--cc=qemu-devel@nongnu.org \
--cc=quintela@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.