From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
To: Max Reitz <mreitz@redhat.com>
Cc: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>,
"qemu-devel@nongnu.org" <qemu-devel@nongnu.org>,
Juan Quintela <quintela@redhat.com>
Subject: Re: Properly quitting qemu immediately after failing migration
Date: Mon, 29 Jun 2020 16:41:04 +0100
Message-ID: <20200629154104.GK2908@work-vm>
In-Reply-To: <0dce6c63-4b83-8b1a-6d00-07235f637997@redhat.com>
* Max Reitz (mreitz@redhat.com) wrote:
> Hi,
>
> In an iotest, I’m trying to quit qemu immediately after a migration has
> failed. Unfortunately, that doesn’t seem to be possible in a clean way:
> migrate_fd_cleanup() runs only at some point after the migration state
> is already “failed”, so if I just wait for that “failed” state and
> immediately quit, some cleanup functions may not have been run yet.
Yeh this is hard; I always take the end of migrate_fd_cleanup to be the
real end.
It always happens on the main thread I think (it's done as a bh in some
cases).
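
To make the ordering concrete, here is a minimal toy model of that race, not QEMU code. Every name in it (ToyMigration, run_main_loop_once, safe_to_quit) is made up for illustration; the only thing it borrows from the discussion is that the state flips to "failed" first and the cleanup runs later as a deferred callback on the main loop.

```python
from collections import deque


class ToyMigration:
    """Toy model only; none of these names are QEMU APIs."""

    def __init__(self):
        self.status = "active"
        self.cleanup_done = False
        self.bitmap_refs = 1  # reference taken when the migration started
        self._bottom_halves = deque()  # deferred callbacks for the main loop

    def fail(self):
        # The new state is visible immediately; cleanup is only scheduled.
        self.status = "failed"
        self._bottom_halves.append(self._cleanup)

    def _cleanup(self):
        # Stands in for the end of the cleanup function: drop the reference.
        self.bitmap_refs -= 1
        self.cleanup_done = True

    def run_main_loop_once(self):
        # Run at most one pending deferred callback.
        if self._bottom_halves:
            self._bottom_halves.popleft()()


def safe_to_quit(mig):
    # Seeing "failed" is not enough; the cleanup must have run as well.
    return mig.status == "failed" and mig.cleanup_done


mig = ToyMigration()
mig.fail()
early = safe_to_quit(mig)   # state is "failed", cleanup still pending
mig.run_main_loop_once()
late = safe_to_quit(mig)    # cleanup has now run
```

In this model a test that quits as soon as it sees "failed" lands in the window where the reference is still held, which is the situation described in the quoted text.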
> This is a problem with dirty bitmap migration at least, because it
> increases the refcount on all block devices that are to be migrated, so
> if we don’t call the cleanup function before quitting, the refcount will
> stay elevated and bdrv_close_all() will hit an assertion because those
> block devices are still around after blk_remove_all_bs() and
> blockdev_close_all_bdrv_states().
>
> In practice this particular issue might not be that big of a problem,
> because it just means qemu aborts when the user intended to let it quit
> anyway. But on one hand I could imagine that there are other clean-up
> paths that should definitely run before qemu quits (although I don’t
> know), and on the other, it’s a problem for my test.
'quit' varies - there are a lot of incoming failures that just assert;
very few of them cause a clean exit (I think there are more clean ones
after Peter's work on restartable postcopy a year or two ago).
I do see the end of migrate_fd_cleanup calls the notifier list; but it's
not clear to me that it's always going to see the first transition to
'failed' at that point.
> I tried working around the problem for my test by waiting on “Unable to
> write” appearing on stderr, because that indicates that
> migrate_fd_cleanup()’s error_report_err() has been reached. But on one
> hand, that isn’t really nice, and on the other, it doesn’t even work
> when the failure is on the source side (because then there is no
> s->error for migrate_fd_cleanup() to report).
>
> In all, I’m asking:
> (1) Is there a nice solution for me now to delay quitting qemu until the
> failed migration has been fully resolved, including the clean-up?
In vl.c, I added a call to migration_shutdown in qemu_cleanup - although
that seems to be mostly about cleaning up the *outgoing* side; you could
add some incoming cleanup there.
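
As a rough sketch of that idea (a toy model under stated assumptions, not the actual vl.c code), the following shows a quit path that runs registered shutdown hooks before the final "close everything" check. ToyVM, shutdown_hooks, release_bitmap_refs and close_all are hypothetical names; close_all only stands in for the assertion mentioned in the quoted text.

```python
class ToyVM:
    """Toy model only; names are illustrative, not QEMU functions."""

    def __init__(self):
        self.node_refs = 0        # extra references held by bitmap migration
        self.shutdown_hooks = []  # callbacks run on quit, before close_all()

    def start_bitmap_migration(self):
        # Migration takes a reference on the block node.
        self.node_refs += 1

    def release_bitmap_refs(self):
        # Idempotent cleanup: safe whether or not the normal cleanup ran.
        self.node_refs = 0

    def close_all(self):
        # Stands in for the exit-time check that no nodes are left over.
        if self.node_refs != 0:
            raise RuntimeError("block nodes still referenced at exit")

    def quit(self):
        for hook in self.shutdown_hooks:
            hook()
        self.close_all()


# With the cleanup registered as a shutdown hook, quitting right after a
# failed migration no longer trips the exit-time check.
vm = ToyVM()
vm.shutdown_hooks.append(vm.release_bitmap_refs)
vm.start_bitmap_migration()
vm.quit()
```

The point of making the hook idempotent is that it can run unconditionally at exit, regardless of whether the deferred cleanup already happened.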
> (2) Isn’t it a problem if qemu crashes when you issue “quit” via QMP at
> the wrong time? Like, maybe lingering subprocesses when using “exec”?
Yeh that should be cleaner, but isn't.
Dave
>
> Thanks,
>
> Max
>
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK