qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Fabiano Rosas <farosas@suse.de>
To: Peter Xu <peterx@redhat.com>
Cc: qemu-devel@nongnu.org, berrange@redhat.com, armbru@redhat.com,
	Claudio Fontana <cfontana@suse.de>, Jim Fehlig <jfehlig@suse.com>,
	"Dr. David Alan Gilbert" <dave@treblig.org>
Subject: Re: [PATCH v2 06/18] monitor: Stop removing non-duplicated fds
Date: Fri, 31 May 2024 12:25:52 -0300	[thread overview]
Message-ID: <87o78mknpb.fsf@suse.de> (raw)
In-Reply-To: <ZljqGitCeG9-Fi9l@x1n>

Peter Xu <peterx@redhat.com> writes:

> On Thu, May 23, 2024 at 04:05:36PM -0300, Fabiano Rosas wrote:
>> We've been up until now cleaning up any file descriptors that have
>> been passed into QEMU and never duplicated[1,2]. A file descriptor
>> without duplicates indicates that no part of QEMU has made use of
>> it. This approach is starting to show some cracks now that we're
>> starting to consume fds from the migration code:
>> 
>> - Doing cleanup every time the last monitor connection closes works to
>>   reap unused fds, but also has the side effect of forcing the
>>   management layer to pass the file descriptors again in case of a
>>   disconnect/re-connect, if that happened to be the only monitor
>>   connection.
>> 
>>   Another side effect is that removing an fd with qmp_remove_fd() is
>>   effectively delayed until the last monitor connection closes.
>> 
>>   The reliance on mon_refcount is also problematic because it's racy.
>> 
>> - Checking runstate_is_running() skips the cleanup unless the VM is
>>   running and avoids premature cleanup of the fds, but also has the
>>   side effect of blocking the legitimate removal of an fd via
>>   qmp_remove_fd() if the VM happens to be in another state.
>> 
>>   This affects qmp_remove_fd() and qmp_query_fdsets() in particular
>>   because requesting a removal at a bad time (guest stopped) might
>>   cause an fd to never be removed, or to be removed at a much later
>>   point in time, causing the query command to continue showing the
>>   supposedly removed fd/fdset.
>> 
>> Note that file descriptors that *have* been duplicated are owned by
>> the code that uses them and will be removed after qemu_close() is
>> called. Therefore we've decided that the best course of action to
>> avoid the undesired side-effects is to stop managing non-duplicated
>> file descriptors.
>> 
>> 1- efb87c1697 ("monitor: Clean up fd sets on monitor disconnect")
>> 2- ebe52b592d ("monitor: Prevent removing fd from set during init")
>> 
>> Signed-off-by: Fabiano Rosas <farosas@suse.de>
>> ---
>>  monitor/fds.c              | 15 ++++++++-------
>>  monitor/hmp.c              |  2 --
>>  monitor/monitor-internal.h |  1 -
>>  monitor/qmp.c              |  2 --
>>  4 files changed, 8 insertions(+), 12 deletions(-)
>> 
>> diff --git a/monitor/fds.c b/monitor/fds.c
>> index 101e21720d..f7b98814fa 100644
>> --- a/monitor/fds.c
>> +++ b/monitor/fds.c
>> @@ -169,6 +169,11 @@ int monitor_get_fd(Monitor *mon, const char *fdname, Error **errp)
>>  
>>  static void monitor_fdset_free(MonFdset *mon_fdset)
>>  {
>> +    /*
>> +     * Only remove an empty fdset. The fds are owned by the user and
>> +     * should have been removed with qmp_remove_fd(). The dup_fds are
>> +     * owned by QEMU and should have been removed with qemu_close().
>> +     */
>>      if (QLIST_EMPTY(&mon_fdset->fds) && QLIST_EMPTY(&mon_fdset->dup_fds)) {
>>          QLIST_REMOVE(mon_fdset, next);
>>          g_free(mon_fdset);
>> @@ -189,9 +194,7 @@ static void monitor_fdset_cleanup(MonFdset *mon_fdset)
>>      MonFdsetFd *mon_fdset_fd_next;
>>  
>>      QLIST_FOREACH_SAFE(mon_fdset_fd, &mon_fdset->fds, next, mon_fdset_fd_next) {
>> -        if ((mon_fdset_fd->removed ||
>> -                (QLIST_EMPTY(&mon_fdset->dup_fds) && mon_refcount == 0)) &&
>> -                runstate_is_running()) {
>> +        if (mon_fdset_fd->removed) {
>
> I don't know the code well, but I like it.
>
>>              monitor_fdset_fd_free(mon_fdset_fd);
>>          }
>>      }
>> @@ -206,7 +209,7 @@ void monitor_fdsets_cleanup(void)
>>  
>>      QEMU_LOCK_GUARD(&mon_fdsets_lock);
>>      QLIST_FOREACH_SAFE(mon_fdset, &mon_fdsets, next, mon_fdset_next) {
>> -        monitor_fdset_cleanup(mon_fdset);
>> +        monitor_fdset_free(mon_fdset);
>>      }
>>  }
>>  
>> @@ -479,9 +482,7 @@ void monitor_fdset_dup_fd_remove(int dup_fd)
>>              if (mon_fdset_fd_dup->fd == dup_fd) {
>>                  QLIST_REMOVE(mon_fdset_fd_dup, next);
>>                  g_free(mon_fdset_fd_dup);
>> -                if (QLIST_EMPTY(&mon_fdset->dup_fds)) {
>> -                    monitor_fdset_cleanup(mon_fdset);
>> -                }
>> +                monitor_fdset_free(mon_fdset);
>
> This and above changes are not crystal clear to me where the _cleanup()
> does extra check "removed" and clean those fds.
>
> I think it'll make it easier for me to understand if this one and the next
> are squashed together.  But maybe it's simply because I didn't fully
> understand.

monitor_fdsets_cleanup() was doing three things previously:

1- Remove the removed=true fds. This is weird, but ok.

2- Remove fds from an fdset that has an empty dup_fds list, but only if
the guest is not running and the monitor is closed. This is problematic.

3- Remove/free fdsets that have become empty due to the above
removals. This is ok.

This patch covers (2), because that is the only change that has a
complex reasoning behind it and we need to document that without
conflating it with the rest of the changes.

Since (3) is still a reasonable thing to do, this patch merely keeps it
around, but using the helpers introduced in the previous patch.

The next patch removed the need for (1), making the removal immediate
instead of delayed. It has it's own much less complex reasoning, which
is: "we don't need to wait to remove the fd".

I hope this clarifies a bit. I would prefer if we kept this and the next
patch separate to avoid having a single patch that does too many
things. I hope to be as explicit as possible with the reason behind
these changes to avoid putting future people in the situation that we
are in now, i.e. having to guess at the reasons behind this code.

If you still think we should squash or if you have more questions, let
me know.

>>                  return;
>>              }
>>          }
>> diff --git a/monitor/hmp.c b/monitor/hmp.c
>> index 69c1b7e98a..460e8832f6 100644
>> --- a/monitor/hmp.c
>> +++ b/monitor/hmp.c
>> @@ -1437,11 +1437,9 @@ static void monitor_event(void *opaque, QEMUChrEvent event)
>>              monitor_resume(mon);
>>          }
>>          qemu_mutex_unlock(&mon->mon_lock);
>> -        mon_refcount++;
>>          break;
>>  
>>      case CHR_EVENT_CLOSED:
>> -        mon_refcount--;
>>          monitor_fdsets_cleanup();
>>          break;
>>  
>> diff --git a/monitor/monitor-internal.h b/monitor/monitor-internal.h
>> index 252de85681..cb628f681d 100644
>> --- a/monitor/monitor-internal.h
>> +++ b/monitor/monitor-internal.h
>> @@ -168,7 +168,6 @@ extern bool qmp_dispatcher_co_shutdown;
>>  extern QmpCommandList qmp_commands, qmp_cap_negotiation_commands;
>>  extern QemuMutex monitor_lock;
>>  extern MonitorList mon_list;
>> -extern int mon_refcount;
>>  
>>  extern HMPCommand hmp_cmds[];
>>  
>> diff --git a/monitor/qmp.c b/monitor/qmp.c
>> index a239945e8d..5e538f34c0 100644
>> --- a/monitor/qmp.c
>> +++ b/monitor/qmp.c
>> @@ -466,7 +466,6 @@ static void monitor_qmp_event(void *opaque, QEMUChrEvent event)
>>          data = qmp_greeting(mon);
>>          qmp_send_response(mon, data);
>>          qobject_unref(data);
>> -        mon_refcount++;
>>          break;
>>      case CHR_EVENT_CLOSED:
>>          /*
>> @@ -479,7 +478,6 @@ static void monitor_qmp_event(void *opaque, QEMUChrEvent event)
>>          json_message_parser_destroy(&mon->parser);
>>          json_message_parser_init(&mon->parser, handle_qmp_command,
>>                                   mon, NULL);
>> -        mon_refcount--;
>>          monitor_fdsets_cleanup();
>>          break;
>>      case CHR_EVENT_BREAK:
>
> I like this too when mon_refcount can be dropped.  It turns out I like this
> patch and the next a lot, and I hope nothing will break.
>
> Aside, you seem to have forgot removal of the "int mon_refcount" in
> monitor.c.

Yes, I'll fix that. Thanks.


  reply	other threads:[~2024-05-31 15:26 UTC|newest]

Thread overview: 60+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-05-23 19:05 [PATCH v2 00/18] migration/mapped-ram: Add direct-io support Fabiano Rosas
2024-05-23 19:05 ` [PATCH v2 01/18] migration: Fix file migration with fdset Fabiano Rosas
2024-05-24 10:51   ` Prasad Pandit
2024-05-24 12:30     ` Fabiano Rosas
2024-05-25  6:16       ` Prasad Pandit
2024-05-30 16:11   ` Peter Xu
2024-05-31 14:58     ` Fabiano Rosas
2024-06-03 10:20   ` Daniel P. Berrangé
2024-05-23 19:05 ` [PATCH v2 02/18] tests/qtest/migration: Fix file migration offset check Fabiano Rosas
2024-05-30 16:14   ` Peter Xu
2024-06-03 10:21   ` Daniel P. Berrangé
2024-05-23 19:05 ` [PATCH v2 03/18] tests/qtest/migration: Add a precopy file test with fdset Fabiano Rosas
2024-05-30 16:18   ` Peter Xu
2024-05-23 19:05 ` [PATCH v2 04/18] monitor: Drop monitor_fdset_dup_fd_add() Fabiano Rosas
2024-06-03 10:26   ` Daniel P. Berrangé
2024-05-23 19:05 ` [PATCH v2 05/18] monitor: Introduce monitor_fdset_*free Fabiano Rosas
2024-05-30 20:03   ` Peter Xu
2024-05-31 15:01     ` Fabiano Rosas
2024-05-23 19:05 ` [PATCH v2 06/18] monitor: Stop removing non-duplicated fds Fabiano Rosas
2024-05-30 21:05   ` Peter Xu
2024-05-31 15:25     ` Fabiano Rosas [this message]
2024-05-31 15:56       ` Peter Xu
2024-06-04 23:40       ` Dr. David Alan Gilbert
2024-06-05 12:31         ` Fabiano Rosas
2024-05-23 19:05 ` [PATCH v2 07/18] monitor: Simplify fdset and fd removal Fabiano Rosas
2024-05-31 15:58   ` Peter Xu
2024-05-23 19:05 ` [PATCH v2 08/18] monitor: Report errors from monitor_fdset_dup_fd_add Fabiano Rosas
2024-05-30 21:08   ` Peter Xu
2024-05-23 19:05 ` [PATCH v2 09/18] io: Stop using qemu_open_old in channel-file Fabiano Rosas
2024-05-30 21:10   ` Peter Xu
2024-05-23 19:05 ` [PATCH v2 10/18] migration: Add direct-io parameter Fabiano Rosas
2024-05-30 21:12   ` Peter Xu
2024-05-23 19:05 ` [PATCH v2 11/18] migration/multifd: Add direct-io support Fabiano Rosas
2024-05-30 21:35   ` Peter Xu
2024-05-31 15:27     ` Fabiano Rosas
2024-05-23 19:05 ` [PATCH v2 12/18] tests/qtest/migration: Add tests for file migration with direct-io Fabiano Rosas
2024-05-23 19:05 ` [PATCH v2 13/18] monitor: fdset: Match against O_DIRECT Fabiano Rosas
2024-05-30 21:41   ` Peter Xu
2024-05-31 15:42     ` Fabiano Rosas
2024-05-31 15:58       ` Peter Xu
2024-05-23 19:05 ` [PATCH v2 14/18] migration: Add documentation for fdset with multifd + file Fabiano Rosas
2024-06-04 20:46   ` Peter Xu
2024-05-23 19:05 ` [PATCH v2 15/18] tests/qtest/migration: Add a test for mapped-ram with passing of fds Fabiano Rosas
2024-06-04 20:51   ` Peter Xu
2024-05-23 19:05 ` [PATCH v2 16/18] io/channel-file: Add direct-io support Fabiano Rosas
2024-06-03 10:32   ` Daniel P. Berrangé
2024-05-23 19:05 ` [PATCH v2 17/18] migration: Add direct-io helpers Fabiano Rosas
2024-05-23 19:05 ` [PATCH v2 18/18] migration/ram: Add direct-io support to precopy file migration Fabiano Rosas
2024-06-04 20:56   ` Peter Xu
2024-06-07 18:42     ` Fabiano Rosas
2024-06-07 20:39       ` Jim Fehlig
2024-06-10 16:09       ` Peter Xu
2024-06-10 17:45         ` Fabiano Rosas
2024-06-10 19:02           ` Peter Xu
2024-06-10 19:07             ` Daniel P. Berrangé
2024-06-10 20:12             ` Fabiano Rosas
2024-06-12 18:08               ` Fabiano Rosas
2024-06-12 18:15                 ` Daniel P. Berrangé
2024-06-12 18:27                   ` Peter Xu
2024-06-12 18:44                     ` Fabiano Rosas

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87o78mknpb.fsf@suse.de \
    --to=farosas@suse.de \
    --cc=armbru@redhat.com \
    --cc=berrange@redhat.com \
    --cc=cfontana@suse.de \
    --cc=dave@treblig.org \
    --cc=jfehlig@suse.com \
    --cc=peterx@redhat.com \
    --cc=qemu-devel@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).