From: Peter Xu <peterx@redhat.com>
To: Fabiano Rosas <farosas@suse.de>
Cc: qemu-devel@nongnu.org, berrange@redhat.com, armbru@redhat.com,
Claudio Fontana <cfontana@suse.de>, Jim Fehlig <jfehlig@suse.com>,
"Dr. David Alan Gilbert" <dave@treblig.org>
Subject: Re: [PATCH v2 06/18] monitor: Stop removing non-duplicated fds
Date: Fri, 31 May 2024 11:56:20 -0400 [thread overview]
Message-ID: <ZlnzJIuUPbN8zQR4@x1n> (raw)
In-Reply-To: <87o78mknpb.fsf@suse.de>
On Fri, May 31, 2024 at 12:25:52PM -0300, Fabiano Rosas wrote:
> Peter Xu <peterx@redhat.com> writes:
>
> > On Thu, May 23, 2024 at 04:05:36PM -0300, Fabiano Rosas wrote:
> >> We've been up until now cleaning up any file descriptors that have
> >> been passed into QEMU and never duplicated[1,2]. A file descriptor
> >> without duplicates indicates that no part of QEMU has made use of
> >> it. This approach is starting to show some cracks now that we're
> >> starting to consume fds from the migration code:
> >>
> >> - Doing cleanup every time the last monitor connection closes works to
> >> reap unused fds, but also has the side effect of forcing the
> >> management layer to pass the file descriptors again in case of a
> >> disconnect/re-connect, if that happened to be the only monitor
> >> connection.
> >>
> >> Another side effect is that removing an fd with qmp_remove_fd() is
> >> effectively delayed until the last monitor connection closes.
> >>
> >> The reliance on mon_refcount is also problematic because it's racy.
> >>
> >> - Checking runstate_is_running() skips the cleanup unless the VM is
> >> running and avoids premature cleanup of the fds, but also has the
> >> side effect of blocking the legitimate removal of an fd via
> >> qmp_remove_fd() if the VM happens to be in another state.
> >>
> >> This affects qmp_remove_fd() and qmp_query_fdsets() in particular
> >> because requesting a removal at a bad time (guest stopped) might
> >> cause an fd to never be removed, or to be removed at a much later
> >> point in time, causing the query command to continue showing the
> >> supposedly removed fd/fdset.
> >>
> >> Note that file descriptors that *have* been duplicated are owned by
> >> the code that uses them and will be removed after qemu_close() is
> >> called. Therefore we've decided that the best course of action to
> >> avoid the undesired side-effects is to stop managing non-duplicated
> >> file descriptors.
> >>
> >> 1- efb87c1697 ("monitor: Clean up fd sets on monitor disconnect")
> >> 2- ebe52b592d ("monitor: Prevent removing fd from set during init")
> >>
> >> Signed-off-by: Fabiano Rosas <farosas@suse.de>
> >> ---
> >> monitor/fds.c | 15 ++++++++-------
> >> monitor/hmp.c | 2 --
> >> monitor/monitor-internal.h | 1 -
> >> monitor/qmp.c | 2 --
> >> 4 files changed, 8 insertions(+), 12 deletions(-)
> >>
> >> diff --git a/monitor/fds.c b/monitor/fds.c
> >> index 101e21720d..f7b98814fa 100644
> >> --- a/monitor/fds.c
> >> +++ b/monitor/fds.c
> >> @@ -169,6 +169,11 @@ int monitor_get_fd(Monitor *mon, const char *fdname, Error **errp)
> >>
> >> static void monitor_fdset_free(MonFdset *mon_fdset)
> >> {
> >> + /*
> >> + * Only remove an empty fdset. The fds are owned by the user and
> >> + * should have been removed with qmp_remove_fd(). The dup_fds are
> >> + * owned by QEMU and should have been removed with qemu_close().
> >> + */
> >> if (QLIST_EMPTY(&mon_fdset->fds) && QLIST_EMPTY(&mon_fdset->dup_fds)) {
> >> QLIST_REMOVE(mon_fdset, next);
> >> g_free(mon_fdset);
> >> @@ -189,9 +194,7 @@ static void monitor_fdset_cleanup(MonFdset *mon_fdset)
> >> MonFdsetFd *mon_fdset_fd_next;
> >>
> >> QLIST_FOREACH_SAFE(mon_fdset_fd, &mon_fdset->fds, next, mon_fdset_fd_next) {
> >> - if ((mon_fdset_fd->removed ||
> >> - (QLIST_EMPTY(&mon_fdset->dup_fds) && mon_refcount == 0)) &&
> >> - runstate_is_running()) {
> >> + if (mon_fdset_fd->removed) {
> >
> > I don't know the code well, but I like it.
> >
> >> monitor_fdset_fd_free(mon_fdset_fd);
> >> }
> >> }
> >> @@ -206,7 +209,7 @@ void monitor_fdsets_cleanup(void)
> >>
> >> QEMU_LOCK_GUARD(&mon_fdsets_lock);
> >> QLIST_FOREACH_SAFE(mon_fdset, &mon_fdsets, next, mon_fdset_next) {
> >> - monitor_fdset_cleanup(mon_fdset);
> >> + monitor_fdset_free(mon_fdset);
> >> }
> >> }
> >>
> >> @@ -479,9 +482,7 @@ void monitor_fdset_dup_fd_remove(int dup_fd)
> >> if (mon_fdset_fd_dup->fd == dup_fd) {
> >> QLIST_REMOVE(mon_fdset_fd_dup, next);
> >> g_free(mon_fdset_fd_dup);
> >> - if (QLIST_EMPTY(&mon_fdset->dup_fds)) {
> >> - monitor_fdset_cleanup(mon_fdset);
> >> - }
> >> + monitor_fdset_free(mon_fdset);
> >
> > This and above changes are not crystal clear to me where the _cleanup()
> > does extra check "removed" and clean those fds.
> >
> > I think it'll make it easier for me to understand if this one and the next
> > are squashed together. But maybe it's simply because I didn't fully
> > understand.
>
> monitor_fdsets_cleanup() was doing three things previously:
>
> 1- Remove the removed=true fds. This is weird, but ok.
>
> 2- Remove fds from an fdset that has an empty dup_fds list, but only if
> the guest is not running and the monitor is closed. This is problematic.
>
> 3- Remove/free fdsets that have become empty due to the above
> removals. This is ok.
>
> This patch covers (2), because that is the only change that has a
> complex reasoning behind it and we need to document that without
> conflating it with the rest of the changes.
>
> Since (3) is still a reasonable thing to do, this patch merely keeps it
> around, but using the helpers introduced in the previous patch.
>
> The next patch removed the need for (1), making the removal immediate
> instead of delayed. It has it's own much less complex reasoning, which
> is: "we don't need to wait to remove the fd".
>
> I hope this clarifies a bit. I would prefer if we kept this and the next
> patch separate to avoid having a single patch that does too many
> things. I hope to be as explicit as possible with the reason behind
> these changes to avoid putting future people in the situation that we
> are in now, i.e. having to guess at the reasons behind this code.
>
> If you still think we should squash or if you have more questions, let
> me know.
Thanks. Mind adding something into the commit message for monitor newbies
(like myself)?
I hope whoever more familiar with monitor can look, but otherwise let's see
what will break and then we have someone to discuss with.
For what it worth, I still want to ack this:
Reviewed-by: Peter Xu <peterx@redhat.com>
>
> >> return;
> >> }
> >> }
> >> diff --git a/monitor/hmp.c b/monitor/hmp.c
> >> index 69c1b7e98a..460e8832f6 100644
> >> --- a/monitor/hmp.c
> >> +++ b/monitor/hmp.c
> >> @@ -1437,11 +1437,9 @@ static void monitor_event(void *opaque, QEMUChrEvent event)
> >> monitor_resume(mon);
> >> }
> >> qemu_mutex_unlock(&mon->mon_lock);
> >> - mon_refcount++;
> >> break;
> >>
> >> case CHR_EVENT_CLOSED:
> >> - mon_refcount--;
> >> monitor_fdsets_cleanup();
> >> break;
> >>
> >> diff --git a/monitor/monitor-internal.h b/monitor/monitor-internal.h
> >> index 252de85681..cb628f681d 100644
> >> --- a/monitor/monitor-internal.h
> >> +++ b/monitor/monitor-internal.h
> >> @@ -168,7 +168,6 @@ extern bool qmp_dispatcher_co_shutdown;
> >> extern QmpCommandList qmp_commands, qmp_cap_negotiation_commands;
> >> extern QemuMutex monitor_lock;
> >> extern MonitorList mon_list;
> >> -extern int mon_refcount;
> >>
> >> extern HMPCommand hmp_cmds[];
> >>
> >> diff --git a/monitor/qmp.c b/monitor/qmp.c
> >> index a239945e8d..5e538f34c0 100644
> >> --- a/monitor/qmp.c
> >> +++ b/monitor/qmp.c
> >> @@ -466,7 +466,6 @@ static void monitor_qmp_event(void *opaque, QEMUChrEvent event)
> >> data = qmp_greeting(mon);
> >> qmp_send_response(mon, data);
> >> qobject_unref(data);
> >> - mon_refcount++;
> >> break;
> >> case CHR_EVENT_CLOSED:
> >> /*
> >> @@ -479,7 +478,6 @@ static void monitor_qmp_event(void *opaque, QEMUChrEvent event)
> >> json_message_parser_destroy(&mon->parser);
> >> json_message_parser_init(&mon->parser, handle_qmp_command,
> >> mon, NULL);
> >> - mon_refcount--;
> >> monitor_fdsets_cleanup();
> >> break;
> >> case CHR_EVENT_BREAK:
> >
> > I like this too when mon_refcount can be dropped. It turns out I like this
> > patch and the next a lot, and I hope nothing will break.
> >
> > Aside, you seem to have forgot removal of the "int mon_refcount" in
> > monitor.c.
>
> Yes, I'll fix that. Thanks.
>
--
Peter Xu
next prev parent reply other threads:[~2024-05-31 15:57 UTC|newest]
Thread overview: 60+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-05-23 19:05 [PATCH v2 00/18] migration/mapped-ram: Add direct-io support Fabiano Rosas
2024-05-23 19:05 ` [PATCH v2 01/18] migration: Fix file migration with fdset Fabiano Rosas
2024-05-24 10:51 ` Prasad Pandit
2024-05-24 12:30 ` Fabiano Rosas
2024-05-25 6:16 ` Prasad Pandit
2024-05-30 16:11 ` Peter Xu
2024-05-31 14:58 ` Fabiano Rosas
2024-06-03 10:20 ` Daniel P. Berrangé
2024-05-23 19:05 ` [PATCH v2 02/18] tests/qtest/migration: Fix file migration offset check Fabiano Rosas
2024-05-30 16:14 ` Peter Xu
2024-06-03 10:21 ` Daniel P. Berrangé
2024-05-23 19:05 ` [PATCH v2 03/18] tests/qtest/migration: Add a precopy file test with fdset Fabiano Rosas
2024-05-30 16:18 ` Peter Xu
2024-05-23 19:05 ` [PATCH v2 04/18] monitor: Drop monitor_fdset_dup_fd_add() Fabiano Rosas
2024-06-03 10:26 ` Daniel P. Berrangé
2024-05-23 19:05 ` [PATCH v2 05/18] monitor: Introduce monitor_fdset_*free Fabiano Rosas
2024-05-30 20:03 ` Peter Xu
2024-05-31 15:01 ` Fabiano Rosas
2024-05-23 19:05 ` [PATCH v2 06/18] monitor: Stop removing non-duplicated fds Fabiano Rosas
2024-05-30 21:05 ` Peter Xu
2024-05-31 15:25 ` Fabiano Rosas
2024-05-31 15:56 ` Peter Xu [this message]
2024-06-04 23:40 ` Dr. David Alan Gilbert
2024-06-05 12:31 ` Fabiano Rosas
2024-05-23 19:05 ` [PATCH v2 07/18] monitor: Simplify fdset and fd removal Fabiano Rosas
2024-05-31 15:58 ` Peter Xu
2024-05-23 19:05 ` [PATCH v2 08/18] monitor: Report errors from monitor_fdset_dup_fd_add Fabiano Rosas
2024-05-30 21:08 ` Peter Xu
2024-05-23 19:05 ` [PATCH v2 09/18] io: Stop using qemu_open_old in channel-file Fabiano Rosas
2024-05-30 21:10 ` Peter Xu
2024-05-23 19:05 ` [PATCH v2 10/18] migration: Add direct-io parameter Fabiano Rosas
2024-05-30 21:12 ` Peter Xu
2024-05-23 19:05 ` [PATCH v2 11/18] migration/multifd: Add direct-io support Fabiano Rosas
2024-05-30 21:35 ` Peter Xu
2024-05-31 15:27 ` Fabiano Rosas
2024-05-23 19:05 ` [PATCH v2 12/18] tests/qtest/migration: Add tests for file migration with direct-io Fabiano Rosas
2024-05-23 19:05 ` [PATCH v2 13/18] monitor: fdset: Match against O_DIRECT Fabiano Rosas
2024-05-30 21:41 ` Peter Xu
2024-05-31 15:42 ` Fabiano Rosas
2024-05-31 15:58 ` Peter Xu
2024-05-23 19:05 ` [PATCH v2 14/18] migration: Add documentation for fdset with multifd + file Fabiano Rosas
2024-06-04 20:46 ` Peter Xu
2024-05-23 19:05 ` [PATCH v2 15/18] tests/qtest/migration: Add a test for mapped-ram with passing of fds Fabiano Rosas
2024-06-04 20:51 ` Peter Xu
2024-05-23 19:05 ` [PATCH v2 16/18] io/channel-file: Add direct-io support Fabiano Rosas
2024-06-03 10:32 ` Daniel P. Berrangé
2024-05-23 19:05 ` [PATCH v2 17/18] migration: Add direct-io helpers Fabiano Rosas
2024-05-23 19:05 ` [PATCH v2 18/18] migration/ram: Add direct-io support to precopy file migration Fabiano Rosas
2024-06-04 20:56 ` Peter Xu
2024-06-07 18:42 ` Fabiano Rosas
2024-06-07 20:39 ` Jim Fehlig
2024-06-10 16:09 ` Peter Xu
2024-06-10 17:45 ` Fabiano Rosas
2024-06-10 19:02 ` Peter Xu
2024-06-10 19:07 ` Daniel P. Berrangé
2024-06-10 20:12 ` Fabiano Rosas
2024-06-12 18:08 ` Fabiano Rosas
2024-06-12 18:15 ` Daniel P. Berrangé
2024-06-12 18:27 ` Peter Xu
2024-06-12 18:44 ` Fabiano Rosas
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=ZlnzJIuUPbN8zQR4@x1n \
--to=peterx@redhat.com \
--cc=armbru@redhat.com \
--cc=berrange@redhat.com \
--cc=cfontana@suse.de \
--cc=dave@treblig.org \
--cc=farosas@suse.de \
--cc=jfehlig@suse.com \
--cc=qemu-devel@nongnu.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).