qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Fabiano Rosas <farosas@suse.de>
To: Peter Xu <peterx@redhat.com>
Cc: qemu-devel@nongnu.org, berrange@redhat.com, armbru@redhat.com,
	Claudio Fontana <cfontana@suse.de>, Jim Fehlig <jfehlig@suse.com>
Subject: Re: [PATCH 2/9] migration: Fix file migration with fdset
Date: Fri, 03 May 2024 18:31:06 -0300	[thread overview]
Message-ID: <87plu2mvl1.fsf@suse.de> (raw)
In-Reply-To: <ZjVRS6yT6n7_wb0V@x1n>

Peter Xu <peterx@redhat.com> writes:

> On Fri, May 03, 2024 at 04:56:08PM -0300, Fabiano Rosas wrote:
>> Peter Xu <peterx@redhat.com> writes:
>> 
>> > On Fri, Apr 26, 2024 at 11:20:35AM -0300, Fabiano Rosas wrote:
>> >> When the migration using the "file:" URI was implemented, I don't
>> >> think any of us noticed that if you pass in a file name with the
>> >> format "/dev/fdset/N", this allows a file descriptor to be passed in
>> >> to QEMU and that behaves just like the "fd:" URI. So the "file:"
>> >> support has been added without regard for the fdset part and we got
>> >> some things wrong.
>> >> 
>> >> The first issue is that we should not truncate the migration file if
>> >> we're allowing an fd + offset. We need to leave the file contents
>> >> untouched.
>> >
>> > I'm wondering whether we can use fallocate() instead on the ranges so that
>> > we always don't open() with O_TRUNC.  Before that..  could you remind me
>> > why do we need to truncate in the first place?  I definitely missed
>> > something else here too.
>> 
>> AFAIK, just to avoid any issues if the file is pre-existing. I don't see
>> the difference between O_TRUNC and fallocate in this case.
>
> Then, shall we avoid truncations at all, leaving all the feasibility to
> user (also errors prone to make)?
>

Is this a big deal? I'd rather close that possible gap and avoid the bug
reports.

>> 
>> >
>> >> 
>> >> The second issue is that there's an expectation that QEMU removes the
>> >> fd after the migration has finished. That's what the "fd:" code
>> >> does. Otherwise a second migration on the same VM could attempt to
>> >> provide an fdset with the same name and QEMU would reject it.
>> >
>> > Let me check what we do when with "fd:" and when migration completes or
>> > cancels.
>> >
>> > IIUC it's qio_channel_file_close() that does the final cleanup work on
>> > e.g. to_dst_file, right?  Then there's qemu_close(), and it has:
>> >
>> >     /* Close fd that was dup'd from an fdset */
>> >     fdset_id = monitor_fdset_dup_fd_find(fd);
>> >     if (fdset_id != -1) {
>> >         int ret;
>> >
>> >         ret = close(fd);
>> >         if (ret == 0) {
>> >             monitor_fdset_dup_fd_remove(fd);
>> >         }
>> >
>> >         return ret;
>> >     }
>> >
>> > Shouldn't this done the work already?
>> 
>> That removes the mon_fdset_fd_dup->fd, we want to remove the
>> mon_fdset_fd->fd.
>
> What I read so far is when we are removing the dup-fds, we'll do one more
> thing:
>
> monitor_fdset_dup_fd_find_remove():
>                     if (QLIST_EMPTY(&mon_fdset->dup_fds)) {
>                         monitor_fdset_cleanup(mon_fdset);
>                     }
>
> It means if we removed all the dup-fds correctly, we should also remove the
> whole fdset, which includes the ->fds, IIUC.
>

Since mon_fdset_fd->removed == false, we hit the runstate_is_running()
problem. I'm not sure, but probably mon_refcount > 0 as well. So the fd
would not be removed.

But I'll retest this on Monday just be sure, it's been a while since I
wrote some parts of this.

>> 
>> >
>> > Off topic: I think this code is over complicated too, maybe I missed
>> > something, but afaict we don't need monitor_fdset_dup_fd_find at all.. we
>> > simply walk the list and remove stuff..  I attach a patch at the end that I
>> > tried to clean that up, just in case there's early comments.  But we can
>> > ignore that so we don't get side-tracked, and focus on the direct-io
>> > issues.
>> 
>> Well, I'm not confident touching this code. This is more than a decade
>> old, I have no idea what the original motivations were. The possible
>> interactions with the user via command-line (-add-fd), QMP (add-fd) and
>> the monitor lifetime make me confused. Not to mention the fdset part
>> being plumbed into the guts of a widely used qemu_open_internal() that
>> very misleadingly presents itself as just a wrapper for open().
>
> If to make QEMU long live, we'll probably need to touch it at some
> point.. or at least discuss about it and figure things out. We pay tech
> debts like this when there's no good comment / docs to refer in this case,
> then the earlier, perhaps also the better.. to try taking the stab, imho.
>
> Definitely not a request to clean everything up. :) Let's see whether
> others can chim in with better knowledge of the history.
>
>> 
>> >
>> > Thanks,
>> >
>> > =======
>> >
>> > From 2f6b6d1224486d8ee830a7afe34738a07003b863 Mon Sep 17 00:00:00 2001
>> > From: Peter Xu <peterx@redhat.com>
>> > Date: Fri, 3 May 2024 11:27:20 -0400
>> > Subject: [PATCH] monitor: Drop monitor_fdset_dup_fd_add()
>> > MIME-Version: 1.0
>> > Content-Type: text/plain; charset=UTF-8
>> > Content-Transfer-Encoding: 8bit
>> >
>> > This function is not needed, one remove function should already work.
>> > Clean it up.
>> >
>> > Here the code doesn't really care about whether we need to keep that dupfd
>> > around if close() failed: when that happens something got very wrong,
>> > keeping the dup_fd around the fdsets may not help that situation so far.
>> >
>> > Cc: Dr. David Alan Gilbert <dave@treblig.org>
>> > Cc: Markus Armbruster <armbru@redhat.com>
>> > Cc: Philippe Mathieu-Daudé <philmd@linaro.org>
>> > Cc: Paolo Bonzini <pbonzini@redhat.com>
>> > Cc: Daniel P. Berrangé <berrange@redhat.com>
>> > Signed-off-by: Peter Xu <peterx@redhat.com>
>> > ---
>> >  include/monitor/monitor.h |  1 -
>> >  monitor/fds.c             | 27 +++++----------------------
>> >  stubs/fdset.c             |  5 -----
>> >  util/osdep.c              | 15 +--------------
>> >  4 files changed, 6 insertions(+), 42 deletions(-)
>> >
>> > diff --git a/include/monitor/monitor.h b/include/monitor/monitor.h
>> > index 965f5d5450..fd9b3f538c 100644
>> > --- a/include/monitor/monitor.h
>> > +++ b/include/monitor/monitor.h
>> > @@ -53,7 +53,6 @@ AddfdInfo *monitor_fdset_add_fd(int fd, bool has_fdset_id, int64_t fdset_id,
>> >                                  const char *opaque, Error **errp);
>> >  int monitor_fdset_dup_fd_add(int64_t fdset_id, int flags);
>> >  void monitor_fdset_dup_fd_remove(int dup_fd);
>> > -int64_t monitor_fdset_dup_fd_find(int dup_fd);
>> >  
>> >  void monitor_register_hmp(const char *name, bool info,
>> >                            void (*cmd)(Monitor *mon, const QDict *qdict));
>> > diff --git a/monitor/fds.c b/monitor/fds.c
>> > index d86c2c674c..d5aecfb70e 100644
>> > --- a/monitor/fds.c
>> > +++ b/monitor/fds.c
>> > @@ -458,7 +458,7 @@ int monitor_fdset_dup_fd_add(int64_t fdset_id, int flags)
>> >  #endif
>> >  }
>> >  
>> > -static int64_t monitor_fdset_dup_fd_find_remove(int dup_fd, bool remove)
>> > +void monitor_fdset_dup_fd_remove(int dup_fd)
>> >  {
>> >      MonFdset *mon_fdset;
>> >      MonFdsetFd *mon_fdset_fd_dup;
>> > @@ -467,31 +467,14 @@ static int64_t monitor_fdset_dup_fd_find_remove(int dup_fd, bool remove)
>> >      QLIST_FOREACH(mon_fdset, &mon_fdsets, next) {
>> >          QLIST_FOREACH(mon_fdset_fd_dup, &mon_fdset->dup_fds, next) {
>> >              if (mon_fdset_fd_dup->fd == dup_fd) {
>> > -                if (remove) {
>> > -                    QLIST_REMOVE(mon_fdset_fd_dup, next);
>> > -                    g_free(mon_fdset_fd_dup);
>> > -                    if (QLIST_EMPTY(&mon_fdset->dup_fds)) {
>> > -                        monitor_fdset_cleanup(mon_fdset);
>> > -                    }
>> > -                    return -1;
>> > -                } else {
>> > -                    return mon_fdset->id;
>> > +                QLIST_REMOVE(mon_fdset_fd_dup, next);
>> > +                g_free(mon_fdset_fd_dup);
>> > +                if (QLIST_EMPTY(&mon_fdset->dup_fds)) {
>> > +                    monitor_fdset_cleanup(mon_fdset);
>> >                  }
>> >              }
>> >          }
>> >      }
>> > -
>> > -    return -1;
>> > -}
>> > -
>> > -int64_t monitor_fdset_dup_fd_find(int dup_fd)
>> > -{
>> > -    return monitor_fdset_dup_fd_find_remove(dup_fd, false);
>> > -}
>> > -
>> > -void monitor_fdset_dup_fd_remove(int dup_fd)
>> > -{
>> > -    monitor_fdset_dup_fd_find_remove(dup_fd, true);
>> >  }
>> >  
>> >  int monitor_fd_param(Monitor *mon, const char *fdname, Error **errp)
>> > diff --git a/stubs/fdset.c b/stubs/fdset.c
>> > index d7c39a28ac..389e368a29 100644
>> > --- a/stubs/fdset.c
>> > +++ b/stubs/fdset.c
>> > @@ -9,11 +9,6 @@ int monitor_fdset_dup_fd_add(int64_t fdset_id, int flags)
>> >      return -1;
>> >  }
>> >  
>> > -int64_t monitor_fdset_dup_fd_find(int dup_fd)
>> > -{
>> > -    return -1;
>> > -}
>> > -
>> >  void monitor_fdset_dup_fd_remove(int dupfd)
>> >  {
>> >  }
>> > diff --git a/util/osdep.c b/util/osdep.c
>> > index e996c4744a..2d9749d060 100644
>> > --- a/util/osdep.c
>> > +++ b/util/osdep.c
>> > @@ -393,21 +393,8 @@ int qemu_open_old(const char *name, int flags, ...)
>> >  
>> >  int qemu_close(int fd)
>> >  {
>> > -    int64_t fdset_id;
>> > -
>> >      /* Close fd that was dup'd from an fdset */
>> > -    fdset_id = monitor_fdset_dup_fd_find(fd);
>> > -    if (fdset_id != -1) {
>> > -        int ret;
>> > -
>> > -        ret = close(fd);
>> > -        if (ret == 0) {
>> > -            monitor_fdset_dup_fd_remove(fd);
>> > -        }
>> > -
>> > -        return ret;
>> > -    }
>> > -
>> > +    monitor_fdset_dup_fd_remove(fd);
>> >      return close(fd);
>> >  }
>> >  
>> > -- 
>> > 2.44.0
>> 


  reply	other threads:[~2024-05-03 21:31 UTC|newest]

Thread overview: 57+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-04-26 14:20 [PATCH 0/9] migration/mapped-ram: Add direct-io support Fabiano Rosas
2024-04-26 14:20 ` [PATCH 1/9] monitor: Honor QMP request for fd removal immediately Fabiano Rosas
2024-05-03 16:02   ` Peter Xu
2024-05-16 21:46     ` Fabiano Rosas
2024-05-08  7:17   ` Daniel P. Berrangé
2024-05-16 22:00     ` Fabiano Rosas
2024-05-17  7:33       ` Daniel P. Berrangé
2024-04-26 14:20 ` [PATCH 2/9] migration: Fix file migration with fdset Fabiano Rosas
2024-05-03 16:23   ` Peter Xu
2024-05-03 19:56     ` Fabiano Rosas
2024-05-03 21:04       ` Peter Xu
2024-05-03 21:31         ` Fabiano Rosas [this message]
2024-05-03 21:56           ` Peter Xu
2024-05-08  8:02     ` Daniel P. Berrangé
2024-05-08 12:49       ` Peter Xu
2024-05-08  8:00   ` Daniel P. Berrangé
2024-05-08 20:45     ` Fabiano Rosas
2024-04-26 14:20 ` [PATCH 3/9] tests/qtest/migration: Fix file migration offset check Fabiano Rosas
2024-05-03 16:47   ` Peter Xu
2024-05-03 20:36     ` Fabiano Rosas
2024-05-03 21:08       ` Peter Xu
2024-05-08  8:10       ` Daniel P. Berrangé
2024-04-26 14:20 ` [PATCH 4/9] migration: Add direct-io parameter Fabiano Rosas
2024-04-26 14:33   ` Markus Armbruster
2024-05-03 18:05   ` Peter Xu
2024-05-03 20:49     ` Fabiano Rosas
2024-05-03 21:16       ` Peter Xu
2024-05-14 14:10         ` Markus Armbruster
2024-05-14 17:57           ` Fabiano Rosas
2024-05-15  7:17             ` Markus Armbruster
2024-05-15 12:51               ` Fabiano Rosas
2024-05-08  8:25   ` Daniel P. Berrangé
2024-04-26 14:20 ` [PATCH 5/9] migration/multifd: Add direct-io support Fabiano Rosas
2024-05-03 18:29   ` Peter Xu
2024-05-03 20:54     ` Fabiano Rosas
2024-05-03 21:18       ` Peter Xu
2024-05-08  8:27   ` Daniel P. Berrangé
2024-04-26 14:20 ` [PATCH 6/9] tests/qtest/migration: Add tests for file migration with direct-io Fabiano Rosas
2024-05-03 18:38   ` Peter Xu
2024-05-03 21:05     ` Fabiano Rosas
2024-05-03 21:25       ` Peter Xu
2024-05-08  8:34   ` Daniel P. Berrangé
2024-04-26 14:20 ` [PATCH 7/9] monitor: fdset: Match against O_DIRECT Fabiano Rosas
2024-05-03 18:53   ` Peter Xu
2024-05-03 21:19     ` Fabiano Rosas
2024-05-03 22:16       ` Peter Xu
2024-04-26 14:20 ` [PATCH 8/9] migration: Add support for fdset with multifd + file Fabiano Rosas
2024-05-08  8:53   ` Daniel P. Berrangé
2024-05-08 18:23     ` Peter Xu
2024-05-08 20:39       ` Fabiano Rosas
2024-05-09  8:08         ` Daniel P. Berrangé
2024-05-17 22:43           ` Fabiano Rosas
2024-05-18  8:36             ` Daniel P. Berrangé
2024-04-26 14:20 ` [PATCH 9/9] tests/qtest/migration: Add a test for mapped-ram with passing of fds Fabiano Rosas
2024-05-08  8:56   ` Daniel P. Berrangé
2024-05-02 20:01 ` [PATCH 0/9] migration/mapped-ram: Add direct-io support Peter Xu
2024-05-02 20:34   ` Fabiano Rosas

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87plu2mvl1.fsf@suse.de \
    --to=farosas@suse.de \
    --cc=armbru@redhat.com \
    --cc=berrange@redhat.com \
    --cc=cfontana@suse.de \
    --cc=jfehlig@suse.com \
    --cc=peterx@redhat.com \
    --cc=qemu-devel@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).