From: Peter Xu <peterx@redhat.com>
To: Steven Sistare <steven.sistare@oracle.com>
Cc: qemu-devel@nongnu.org, Fabiano Rosas <farosas@suse.de>,
David Hildenbrand <david@redhat.com>,
Marcel Apfelbaum <marcel.apfelbaum@gmail.com>,
Eduardo Habkost <eduardo@habkost.net>,
Philippe Mathieu-Daude <philmd@linaro.org>,
Paolo Bonzini <pbonzini@redhat.com>,
"Daniel P. Berrange" <berrange@redhat.com>,
Markus Armbruster <armbru@redhat.com>
Subject: Re: [PATCH V4 06/19] physmem: preserve ram blocks for cpr
Date: Thu, 12 Dec 2024 17:48:03 -0500 [thread overview]
Message-ID: <Z1toIxDzI56ODYcC@x1n> (raw)
In-Reply-To: <bbb7b4a9-6078-4cb1-89c9-ec2d57b996f0@oracle.com>
On Thu, Dec 12, 2024 at 03:38:14PM -0500, Steven Sistare wrote:
> On 12/9/2024 3:07 PM, Peter Xu wrote:
> > On Mon, Dec 02, 2024 at 05:19:58AM -0800, Steve Sistare wrote:
> > > Save the memfd for ramblocks in CPR state, along with a name that
> > > uniquely identifies it. The block's idstr is not yet set, so it
> > > cannot be used for this purpose. Find the saved memfd in new QEMU when
> > > creating a block. If the block size is larger in new QEMU, extend the
> > > block using fallocate, and the extra space will be useable after a guest
> > > reset.
> > >
> > > Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
> > > ---
> > > system/physmem.c | 36 ++++++++++++++++++++++++++++++++++--
> > > 1 file changed, 34 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/system/physmem.c b/system/physmem.c
> > > index 0bcb2cc..aa095a3 100644
> > > --- a/system/physmem.c
> > > +++ b/system/physmem.c
> > > @@ -70,6 +70,7 @@
> > > #include "qemu/pmem.h"
> > > +#include "migration/cpr.h"
> > > #include "migration/vmstate.h"
> > > #include "qemu/range.h"
> > > @@ -1661,6 +1662,19 @@ void qemu_ram_unset_idstr(RAMBlock *block)
> > > }
> > > }
> > > +static char *cpr_name(RAMBlock *block)
> > > +{
> > > + MemoryRegion *mr = block->mr;
> > > + const char *mr_name = memory_region_name(mr);
> > > + g_autofree char *id = mr->dev ? qdev_get_dev_path(mr->dev) : NULL;
> > > +
> > > + if (id) {
> > > + return g_strdup_printf("%s/%s", id, mr_name);
> > > + } else {
> > > + return g_strdup(mr_name);
> > > + }
> > > +}
> > > +
> > > size_t qemu_ram_pagesize(RAMBlock *rb)
> > > {
> > > return rb->page_size;
> > > @@ -2080,8 +2094,18 @@ static bool qemu_ram_alloc_shared(RAMBlock *new_block, Error **errp)
> > > {
> > > size_t max_length = new_block->max_length;
> > > MemoryRegion *mr = new_block->mr;
> > > - const char *name = memory_region_name(mr);
> > > - int fd;
> > > + g_autofree char *name = cpr_name(new_block);
> > > + int fd = cpr_find_fd(name, 0);
> >
> > If to use the proposed patch in the reply of patch 2, here this should be
> > able to be moved to qemu_ram_alloc_anonymous_fd(), IIUC.
> >
> > > +
> > > + if (fd >= 0) {
> > > + if (lseek(fd, 0, SEEK_END) < max_length && ftruncate(fd, max_length)) {
> > > + error_setg_errno(errp, errno,
> > > + "cannot grow ram block %s fd %d to %ld bytes",
> > > + name, fd, max_length);
> > > + goto err;
> > > + }
> >
> > I remember we discussed something similar to this, do we need ftruncate()
> > at all? I think not.
> >
> > This happens when booting QEMU, so I don't think it's relevant yet to what
> > size used in src, as this is dest.
> >
> > It starts to get relevant only when cpr migration starts on src, it sents
> > ramblocks at the beginning, then parse_ramblock() will properly resize any
> > ramblock to whatever size it should use.
> >
> > If the resize didn't happen it can only mean that used_length is correctly
> > matched on both sides.
> >
> > So I don't see why a special truncate() call is needed yet..
>
> You suggested truncate:
>
> https://lore.kernel.org/qemu-devel/47d6d984-7002-4086-bb10-b191168f141f@oracle.com/
>
> "So after such system reset, QEMU might start to see new ROM code loaded
> here (not the one that got migrated anymore, which will only match the
> version installed on src QEMU). Here the problem is the new firmware can
> be larger, so I _think_ we need to make sure max_length is not modified by
> CPR to allow resizing happen here, while if we use truncate=true here it
> should just work in all cases."
>
> ... but you suggested passing a truncate bool to the file_ram_alloc call after
> cpr_find_fd. I could do that instead. However, if qemu_ram_alloc_shared uses
> qemu_ram_alloc_from_fd instead of file_ram_alloc, per your suggestion in patch 2,
> then I will still call ftruncate here, because qemu_ram_alloc_from_fd does not
> take a truncate argument.
My memory was when reuse qemu_ram_alloc_from_fd() in that suggestion of
patch 2, it will only create zero-length fd (with fsize=0) and leave all
the rest to qemu_ram_alloc_from_fd(), then with that:
qemu_ram_alloc_from_fd:
new_block->host = file_ram_alloc(new_block, size, fd, !file_size, offset,
errp);
So that'll always have truncate==!file_size==1. Then truncate will be done
at file_ram_alloc() later, iiuc.
if (truncate && ftruncate(fd, offset + memory)) {
perror("ftruncate");
}
Would this work?
Meanwhile, this whole ram resize discussion reminded me that to reuse
qemu_ram_alloc_from_fd(), we may also want to make sure to pass ->resized()
hook from qemu_ram_alloc_internal() to the fd helper too.. IOW, we may
want to keep qemu_ram_resize() invoke those hooks, even after switching to
fd-based for aux mems.
Maybe the size / max_size also need to be passed over? As for fd ramblock
it used to be always the same on used_length/max_length, but not anymore
when we switch aux mem to fd based. Please feel free to double check..
--
Peter Xu
next prev parent reply other threads:[~2024-12-12 22:49 UTC|newest]
Thread overview: 78+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-12-02 13:19 [PATCH V4 00/19] Live update: cpr-transfer Steve Sistare
2024-12-02 13:19 ` [PATCH V4 01/19] backends/hostmem-shm: factor out allocation of "anonymous shared memory with an fd" Steve Sistare
2024-12-09 17:36 ` Peter Xu
2024-12-12 20:37 ` Steven Sistare
2024-12-02 13:19 ` [PATCH V4 02/19] physmem: fd-based shared memory Steve Sistare
2024-12-09 19:42 ` Peter Xu
2024-12-12 20:38 ` Steven Sistare
2024-12-12 21:22 ` Peter Xu
2024-12-13 16:41 ` Steven Sistare
2024-12-13 17:05 ` Steven Sistare
2024-12-16 18:19 ` Peter Xu
2024-12-17 21:54 ` Steven Sistare
2024-12-17 22:46 ` Peter Xu
2024-12-18 16:34 ` Steven Sistare
2024-12-02 13:19 ` [PATCH V4 03/19] memory: add RAM_PRIVATE Steve Sistare
2024-12-09 19:45 ` Peter Xu
2024-12-02 13:19 ` [PATCH V4 04/19] machine: aux-ram-share option Steve Sistare
2024-12-05 8:25 ` Markus Armbruster
2024-12-05 14:24 ` Steven Sistare
2024-12-05 12:08 ` Markus Armbruster
2024-12-05 12:19 ` Markus Armbruster
2024-12-05 14:24 ` Steven Sistare
2024-12-09 19:54 ` Peter Xu
2024-12-12 20:38 ` Steven Sistare
2024-12-12 21:22 ` Peter Xu
2024-12-02 13:19 ` [PATCH V4 05/19] migration: cpr-state Steve Sistare
2024-12-02 13:19 ` [PATCH V4 06/19] physmem: preserve ram blocks for cpr Steve Sistare
2024-12-09 20:07 ` Peter Xu
2024-12-12 20:38 ` Steven Sistare
2024-12-12 22:48 ` Peter Xu [this message]
2024-12-13 15:21 ` Peter Xu
2024-12-13 15:30 ` Steven Sistare
2024-12-18 16:34 ` Steven Sistare
2024-12-18 17:00 ` Peter Xu
2024-12-18 20:22 ` Steven Sistare
2024-12-18 20:33 ` Peter Xu
2024-12-02 13:19 ` [PATCH V4 07/19] hostmem-memfd: preserve " Steve Sistare
2024-12-18 19:53 ` Steven Sistare
2024-12-18 20:23 ` Peter Xu
2024-12-02 13:20 ` [PATCH V4 08/19] hostmem-shm: " Steve Sistare
2024-12-12 17:38 ` Peter Xu
2024-12-02 13:20 ` [PATCH V4 09/19] migration: incoming channel Steve Sistare
2024-12-05 15:23 ` Markus Armbruster
2024-12-05 20:45 ` Steven Sistare
2024-12-09 12:12 ` Markus Armbruster
2024-12-09 16:36 ` Peter Xu
2024-12-11 9:18 ` Markus Armbruster
2024-12-11 18:58 ` Steven Sistare
2024-12-10 12:46 ` Markus Armbruster
2024-12-02 13:20 ` [PATCH V4 10/19] migration: cpr channel Steve Sistare
2024-12-05 15:37 ` Markus Armbruster
2024-12-05 20:46 ` Steven Sistare
2024-12-06 9:31 ` Markus Armbruster
2024-12-18 19:53 ` Steven Sistare
2024-12-18 20:27 ` Peter Xu
2024-12-18 20:31 ` Steven Sistare
2024-12-02 13:20 ` [PATCH V4 11/19] migration: SCM_RIGHTS for QEMUFile Steve Sistare
2024-12-02 13:20 ` [PATCH V4 12/19] migration: VMSTATE_FD Steve Sistare
2024-12-02 13:20 ` [PATCH V4 13/19] migration: cpr-transfer save and load Steve Sistare
2024-12-02 13:20 ` [PATCH V4 14/19] migration: cpr-transfer mode Steve Sistare
2024-12-04 16:10 ` Steven Sistare
2024-12-10 12:26 ` Markus Armbruster
2024-12-11 22:05 ` Steven Sistare
2024-12-02 13:20 ` [PATCH V4 15/19] tests/migration-test: memory_backend Steve Sistare
2024-12-02 13:20 ` [PATCH V4 16/19] tests/qtest: defer connection Steve Sistare
2024-12-18 21:02 ` Steven Sistare
2024-12-19 15:46 ` Peter Xu
2024-12-19 22:33 ` Steven Sistare
2024-12-02 13:20 ` [PATCH V4 17/19] tests/migration-test: " Steve Sistare
2024-12-02 13:20 ` [PATCH V4 18/19] migration-test: cpr-transfer Steve Sistare
2024-12-18 21:03 ` Steven Sistare
2024-12-19 16:56 ` Peter Xu
2024-12-19 22:34 ` Steven Sistare
2024-12-20 15:41 ` Peter Xu
2024-12-02 13:20 ` [PATCH V4 19/19] migration: cpr-transfer documentation Steve Sistare
2024-12-18 21:03 ` Steven Sistare
2024-12-19 17:02 ` Peter Xu
2024-12-19 22:35 ` Steven Sistare
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=Z1toIxDzI56ODYcC@x1n \
--to=peterx@redhat.com \
--cc=armbru@redhat.com \
--cc=berrange@redhat.com \
--cc=david@redhat.com \
--cc=eduardo@habkost.net \
--cc=farosas@suse.de \
--cc=marcel.apfelbaum@gmail.com \
--cc=pbonzini@redhat.com \
--cc=philmd@linaro.org \
--cc=qemu-devel@nongnu.org \
--cc=steven.sistare@oracle.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.