* [PATCH 0/3] migration: Add support for mapped-ram with snapshots
From: Marco Cavenati @ 2025-10-01 16:18 UTC
To: qemu-devel; +Cc: peterx, farosas, ppandit, berrange, Marco Cavenati
Hello,
Add support for mapped-ram capability in combination with savevm/loadvm
snapshots.
Explicitly zero out memory pages, since they are not guaranteed to already be
zero (as they are for an incoming migration).
Add the feature flag and plumbing to QIOChannelBlock to support non-sequential
operations.
Part of this series was already submitted here:
https://lore.kernel.org/all/20250327141451.163744-3-Marco.Cavenati@eurecom.fr/
Thanks for all the feedback and guidance received there.
Marco Cavenati (3):
migration: add FEATURE_SEEKABLE to QIOChannelBlock
migration/ram: fix docs of ram_handle_zero
migration: mapped-ram: handle zero pages
migration/channel-block.c | 48 +++++++++++++++++++++++++++++++++
migration/ram.c | 57 +++++++++++++++++++++++++++++++++++++--
2 files changed, 103 insertions(+), 2 deletions(-)
--
2.48.1
* [PATCH 1/3] migration: add FEATURE_SEEKABLE to QIOChannelBlock
From: Marco Cavenati @ 2025-10-01 16:18 UTC
To: qemu-devel; +Cc: peterx, farosas, ppandit, berrange, Marco Cavenati
Enable the use of the mapped-ram migration feature with savevm/loadvm
snapshots by adding the QIO_CHANNEL_FEATURE_SEEKABLE feature to
QIOChannelBlock. Implement io_preadv and io_pwritev methods to provide
positioned I/O capabilities that don't modify the channel's position
pointer.
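For context, a minimal sketch (not part of this patch) of how a caller is
expected to consume the feature, assuming the generic positioned-I/O helpers
added for mapped-ram (qio_channel_has_feature(), qio_channel_pwrite() and
qio_channel_pread()); the offset and buffer below are made up for illustration:

/* needs "qemu/osdep.h", "io/channel.h" and "qapi/error.h" */
static bool example_rw_at_fixed_offset(QIOChannel *ioc, Error **errp)
{
    char buf[16] = "vmstate-example";
    char out[16] = { 0 };

    if (!qio_channel_has_feature(ioc, QIO_CHANNEL_FEATURE_SEEKABLE)) {
        error_setg(errp, "channel does not support positioned I/O");
        return false;
    }

    /* Positioned I/O: neither call moves the channel's position pointer. */
    if (qio_channel_pwrite(ioc, buf, sizeof(buf), 0x1000, errp) < 0) {
        return false;
    }
    if (qio_channel_pread(ioc, out, sizeof(out), 0x1000, errp) < 0) {
        return false;
    }

    return true;
}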
Signed-off-by: Marco Cavenati <Marco.Cavenati@eurecom.fr>
---
migration/channel-block.c | 48 +++++++++++++++++++++++++++++++++++++++
1 file changed, 48 insertions(+)
diff --git a/migration/channel-block.c b/migration/channel-block.c
index 97de5a691b..30dcefcd8e 100644
--- a/migration/channel-block.c
+++ b/migration/channel-block.c
@@ -30,6 +30,7 @@ qio_channel_block_new(BlockDriverState *bs)
QIOChannelBlock *ioc;
ioc = QIO_CHANNEL_BLOCK(object_new(TYPE_QIO_CHANNEL_BLOCK));
+ qio_channel_set_feature(QIO_CHANNEL(ioc), QIO_CHANNEL_FEATURE_SEEKABLE);
bdrv_ref(bs);
ioc->bs = bs;
@@ -96,6 +97,49 @@ qio_channel_block_writev(QIOChannel *ioc,
return qiov.size;
}
+#ifdef CONFIG_PREADV
+static ssize_t
+qio_channel_block_preadv(QIOChannel *ioc,
+ const struct iovec *iov,
+ size_t niov,
+ off_t offset,
+ Error **errp)
+{
+ QIOChannelBlock *bioc = QIO_CHANNEL_BLOCK(ioc);
+ QEMUIOVector qiov;
+ int ret;
+
+ qemu_iovec_init_external(&qiov, (struct iovec *)iov, niov);
+ ret = bdrv_readv_vmstate(bioc->bs, &qiov, offset);
+ if (ret < 0) {
+ error_setg_errno(errp, -ret, "bdrv_readv_vmstate failed");
+ return -1;
+ }
+
+ return qiov.size;
+}
+
+static ssize_t
+qio_channel_block_pwritev(QIOChannel *ioc,
+ const struct iovec *iov,
+ size_t niov,
+ off_t offset,
+ Error **errp)
+{
+ QIOChannelBlock *bioc = QIO_CHANNEL_BLOCK(ioc);
+ QEMUIOVector qiov;
+ int ret;
+
+ qemu_iovec_init_external(&qiov, (struct iovec *)iov, niov);
+ ret = bdrv_writev_vmstate(bioc->bs, &qiov, offset);
+ if (ret < 0) {
+ error_setg_errno(errp, -ret, "bdrv_writev_vmstate failed");
+ return -1;
+ }
+
+ return qiov.size;
+}
+#endif /* CONFIG_PREADV */
static int
qio_channel_block_set_blocking(QIOChannel *ioc,
@@ -177,6 +221,10 @@ qio_channel_block_class_init(ObjectClass *klass,
ioc_klass->io_writev = qio_channel_block_writev;
ioc_klass->io_readv = qio_channel_block_readv;
ioc_klass->io_set_blocking = qio_channel_block_set_blocking;
+#ifdef CONFIG_PREADV
+ ioc_klass->io_preadv = qio_channel_block_preadv;
+ ioc_klass->io_pwritev = qio_channel_block_pwritev;
+#endif
ioc_klass->io_seek = qio_channel_block_seek;
ioc_klass->io_close = qio_channel_block_close;
ioc_klass->io_set_aio_fd_handler = qio_channel_block_set_aio_fd_handler;
--
2.48.1
* [PATCH 2/3] migration/ram: fix docs of ram_handle_zero
From: Marco Cavenati @ 2025-10-01 16:18 UTC
To: qemu-devel; +Cc: peterx, farosas, ppandit, berrange, Marco Cavenati
Remove outdated 'ch' parameter from the function documentation.
Signed-off-by: Marco Cavenati <Marco.Cavenati@eurecom.fr>
---
migration/ram.c | 1 -
1 file changed, 1 deletion(-)
diff --git a/migration/ram.c b/migration/ram.c
index 7208bc114f..e238c9233f 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -3552,7 +3552,6 @@ static inline void *colo_cache_from_block_offset(RAMBlock *block,
* determined to be zero, then zap it.
*
* @host: host address for the zero page
- * @ch: what the page is filled from. We only support zero
* @size: size of the zero page
*/
void ram_handle_zero(void *host, uint64_t size)
--
2.48.1
* [PATCH 3/3] migration: mapped-ram: handle zero pages
From: Marco Cavenati @ 2025-10-01 16:18 UTC
To: qemu-devel; +Cc: peterx, farosas, ppandit, berrange, Marco Cavenati
Make mapped-ram compatible with loadvm snapshot restoring by explicitly
zeroing memory pages in this case.
Skip zeroing for -incoming and -loadvm migrations to preserve performance.
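To illustrate why (an assumed example, not taken from the patch): in the
mapped-ram format only pages whose bit is set in the file bitmap carry data in
the snapshot; a clear bit means the page was zero when it was saved.

  bitmap:      1 1 0 0 1 0
  page index:  0 1 2 3 4 5

Pages 0, 1 and 4 are read back from the file. Pages 2, 3 and 5 have no data on
disk; on a fresh start (-incoming, -loadvm) guest RAM is still zero-filled so
nothing needs to be done, but an HMP loadvm restores into a running guest whose
RAM may hold stale data, so those pages must be zeroed explicitly.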
Signed-off-by: Marco Cavenati <Marco.Cavenati@eurecom.fr>
---
migration/ram.c | 56 ++++++++++++++++++++++++++++++++++++++++++++++++-
1 file changed, 55 insertions(+), 1 deletion(-)
diff --git a/migration/ram.c b/migration/ram.c
index e238c9233f..597d5ffe9e 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -3958,12 +3958,55 @@ static size_t ram_load_multifd_pages(void *host_addr, size_t size,
return size;
}
+/**
+ * handle_zero_mapped_ram: Zero out a range of RAM pages if required during
+ * mapped-ram load
+ *
+ * Zeroing is only performed when restoring from a snapshot (HMP loadvm).
+ * During incoming migration or -loadvm cli snapshot load, the function is a
+ * no-op and returns true as in those cases the pages are already guaranteed to
+ * be zeroed.
+ *
+ * Returns: true on success, false on error (with @errp set).
+ * @from_bit_idx: Starting index relative to the map of the page (inclusive)
+ * @to_bit_idx: Ending index relative to the map of the page (exclusive)
+ */
+static bool handle_zero_mapped_ram(RAMBlock *block, unsigned long from_bit_idx,
+ unsigned long to_bit_idx, Error **errp)
+{
+ ERRP_GUARD();
+ ram_addr_t offset;
+ size_t size;
+ void *host;
+
+ if (runstate_check(RUN_STATE_INMIGRATE) ||
+ runstate_check(RUN_STATE_PRELAUNCH)) {
+ return true;
+ }
+
+ if (from_bit_idx == to_bit_idx) {
+ return true;
+ }
+
+ size = TARGET_PAGE_SIZE * (to_bit_idx - from_bit_idx);
+ offset = from_bit_idx << TARGET_PAGE_BITS;
+ host = host_from_ram_block_offset(block, offset);
+ if (!host) {
+ error_setg(errp, "zero page outside of ramblock %s range",
+ block->idstr);
+ return false;
+ }
+ ram_handle_zero(host, size);
+
+ return true;
+}
+
static bool read_ramblock_mapped_ram(QEMUFile *f, RAMBlock *block,
long num_pages, unsigned long *bitmap,
Error **errp)
{
ERRP_GUARD();
- unsigned long set_bit_idx, clear_bit_idx;
+ unsigned long set_bit_idx, clear_bit_idx = 0;
ram_addr_t offset;
void *host;
size_t read, unread, size;
@@ -3972,6 +4015,12 @@ static bool read_ramblock_mapped_ram(QEMUFile *f, RAMBlock *block,
set_bit_idx < num_pages;
set_bit_idx = find_next_bit(bitmap, num_pages, clear_bit_idx + 1)) {
+ /* Zero pages */
+ if (!handle_zero_mapped_ram(block, set_bit_idx, clear_bit_idx, errp)) {
+ return false;
+ }
+
+ /* Non-zero pages */
clear_bit_idx = find_next_zero_bit(bitmap, num_pages, set_bit_idx + 1);
unread = TARGET_PAGE_SIZE * (clear_bit_idx - set_bit_idx);
@@ -4003,6 +4052,11 @@ static bool read_ramblock_mapped_ram(QEMUFile *f, RAMBlock *block,
}
}
+ /* Handle trailing 0 pages */
+ if (!handle_zero_mapped_ram(block, num_pages, clear_bit_idx, errp)) {
+ return false;
+ }
+
return true;
err:
--
2.48.1
* Re: [PATCH 3/3] migration: mapped-ram: handle zero pages
From: Marco Cavenati @ 2025-10-02 8:49 UTC
To: Marco Cavenati; +Cc: qemu-devel, peterx, farosas, ppandit, berrange
Please note that there are a couple of errors (swapped parameters), which are
detailed below.
I will address these in the next iteration, along with any additional changes
based on your feedback.
Thank you
Marco
On Wednesday, October 01, 2025 18:18 CEST, Marco Cavenati <Marco.Cavenati@eurecom.fr> wrote:
> Make mapped-ram compatible with loadvm snapshot restoring by explicitly
> zeroing memory pages in this case.
> Skip zeroing for -incoming and -loadvm migrations to preserve performance.
>
> Signed-off-by: Marco Cavenati <Marco.Cavenati@eurecom.fr>
> ---
> migration/ram.c | 56 ++++++++++++++++++++++++++++++++++++++++++++++++-
> 1 file changed, 55 insertions(+), 1 deletion(-)
>
> diff --git a/migration/ram.c b/migration/ram.c
> index e238c9233f..597d5ffe9e 100644
> --- a/migration/ram.c
> +++ b/migration/ram.c
> @@ -3958,12 +3958,55 @@ static size_t ram_load_multifd_pages(void *host_addr, size_t size,
> return size;
> }
>
> +/**
> + * handle_zero_mapped_ram: Zero out a range of RAM pages if required during
> + * mapped-ram load
> + *
> + * Zeroing is only performed when restoring from a snapshot (HMP loadvm).
> + * During incoming migration or -loadvm cli snapshot load, the function is a
> + * no-op and returns true as in those cases the pages are already guaranteed to
> + * be zeroed.
> + *
> + * Returns: true on success, false on error (with @errp set).
> + * @from_bit_idx: Starting index relative to the map of the page (inclusive)
> + * @to_bit_idx: Ending index relative to the map of the page (exclusive)
> + */
> +static bool handle_zero_mapped_ram(RAMBlock *block, unsigned long from_bit_idx,
> + unsigned long to_bit_idx, Error **errp)
> +{
> + ERRP_GUARD();
> + ram_addr_t offset;
> + size_t size;
> + void *host;
> +
> + if (runstate_check(RUN_STATE_INMIGRATE) ||
> + runstate_check(RUN_STATE_PRELAUNCH)) {
> + return true;
> + }
> +
> + if (from_bit_idx == to_bit_idx) {
> + return true;
> + }
> +
> + size = TARGET_PAGE_SIZE * (to_bit_idx - from_bit_idx);
> + offset = from_bit_idx << TARGET_PAGE_BITS;
> + host = host_from_ram_block_offset(block, offset);
> + if (!host) {
> + error_setg(errp, "zero page outside of ramblock %s range",
> + block->idstr);
> + return false;
> + }
> + ram_handle_zero(host, size);
> +
> + return true;
> +}
> +
> static bool read_ramblock_mapped_ram(QEMUFile *f, RAMBlock *block,
> long num_pages, unsigned long *bitmap,
> Error **errp)
> {
> ERRP_GUARD();
> - unsigned long set_bit_idx, clear_bit_idx;
> + unsigned long set_bit_idx, clear_bit_idx = 0;
> ram_addr_t offset;
> void *host;
> size_t read, unread, size;
> @@ -3972,6 +4015,12 @@ static bool read_ramblock_mapped_ram(QEMUFile *f, RAMBlock *block,
> set_bit_idx < num_pages;
> set_bit_idx = find_next_bit(bitmap, num_pages, clear_bit_idx + 1)) {
>
> + /* Zero pages */
> + if (!handle_zero_mapped_ram(block, set_bit_idx, clear_bit_idx, errp)) {
This should be
+ if (!handle_zero_mapped_ram(block, clear_bit_idx, set_bit_idx, errp)) {
> + return false;
> + }
> +
> + /* Non-zero pages */
> clear_bit_idx = find_next_zero_bit(bitmap, num_pages, set_bit_idx + 1);
>
> unread = TARGET_PAGE_SIZE * (clear_bit_idx - set_bit_idx);
> @@ -4003,6 +4052,11 @@ static bool read_ramblock_mapped_ram(QEMUFile *f, RAMBlock *block,
> }
> }
>
> + /* Handle trailing 0 pages */
> + if (!handle_zero_mapped_ram(block, num_pages, clear_bit_idx, errp)) {
This should be
+ if (!handle_zero_mapped_ram(block, clear_bit_idx, num_pages, errp)) {
> + return false;
> + }
> +
> return true;
>
> err:
> --
> 2.48.1
>
* Re: [PATCH 1/3] migration: add FEATURE_SEEKABLE to QIOChannelBlock
From: Juraj Marcin @ 2025-10-02 14:45 UTC
To: Marco Cavenati; +Cc: qemu-devel, peterx, farosas, ppandit, berrange
Hi Marco,
On 2025-10-01 18:18, Marco Cavenati wrote:
> Enable the use of the mapped-ram migration feature with savevm/loadvm
> snapshots by adding the QIO_CHANNEL_FEATURE_SEEKABLE feature to
> QIOChannelBlock. Implement io_preadv and io_pwritev methods to provide
> positioned I/O capabilities that don't modify the channel's position
> pointer.
>
> Signed-off-by: Marco Cavenati <Marco.Cavenati@eurecom.fr>
> ---
> migration/channel-block.c | 48 +++++++++++++++++++++++++++++++++++++++
> 1 file changed, 48 insertions(+)
>
> diff --git a/migration/channel-block.c b/migration/channel-block.c
> index 97de5a691b..30dcefcd8e 100644
> --- a/migration/channel-block.c
> +++ b/migration/channel-block.c
> @@ -30,6 +30,7 @@ qio_channel_block_new(BlockDriverState *bs)
> QIOChannelBlock *ioc;
>
> ioc = QIO_CHANNEL_BLOCK(object_new(TYPE_QIO_CHANNEL_BLOCK));
> + qio_channel_set_feature(QIO_CHANNEL(ioc), QIO_CHANNEL_FEATURE_SEEKABLE);
>
> bdrv_ref(bs);
> ioc->bs = bs;
> @@ -96,6 +97,49 @@ qio_channel_block_writev(QIOChannel *ioc,
> return qiov.size;
> }
>
> +#ifdef CONFIG_PREADV
I don't think this conditional macro is necessary here. QIOChannelFile
needs it because it directly uses the preadv() syscall, which might not be
available on all systems (see where CONFIG_PREADV is defined below).
config_host_data.set('CONFIG_PREADV', cc.has_function('preadv', prefix: '#include <sys/uio.h>'))
However, bdrv_readv_vmstate() (a wrapper around bdrv_co_readv_vmstate())
should always be available IIUC (it is not conditionally compiled based on
whether the preadv() syscall is available).
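Concretely, the suggestion would amount to something like this (an untested
sketch): drop the guard around the two new functions and register them
unconditionally in qio_channel_block_class_init():

    /* no CONFIG_PREADV guard needed: only bdrv_*_vmstate() is used */
    ioc_klass->io_preadv = qio_channel_block_preadv;
    ioc_klass->io_pwritev = qio_channel_block_pwritev;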
> +static ssize_t
> +qio_channel_block_preadv(QIOChannel *ioc,
> + const struct iovec *iov,
> + size_t niov,
> + off_t offset,
> + Error **errp)
> +{
> + QIOChannelBlock *bioc = QIO_CHANNEL_BLOCK(ioc);
> + QEMUIOVector qiov;
> + int ret;
> +
> + qemu_iovec_init_external(&qiov, (struct iovec *)iov, niov);
> + ret = bdrv_readv_vmstate(bioc->bs, &qiov, offset);
> + if (ret < 0) {
> + error_setg_errno(errp, -ret, "bdrv_readv_vmstate failed");
> + return -1;
> + }
> +
> + return qiov.size;
> +}
> +
> +static ssize_t
> +qio_channel_block_pwritev(QIOChannel *ioc,
> + const struct iovec *iov,
> + size_t niov,
> + off_t offset,
> + Error **errp)
> +{
> + QIOChannelBlock *bioc = QIO_CHANNEL_BLOCK(ioc);
> + QEMUIOVector qiov;
> + int ret;
> +
> + qemu_iovec_init_external(&qiov, (struct iovec *)iov, niov);
> + ret = bdrv_writev_vmstate(bioc->bs, &qiov, offset);
> + if (ret < 0) {
> + error_setg_errno(errp, -ret, "bdrv_writev_vmstate failed");
> + return -1;
> + }
> +
> + return qiov.size;
> +}
> +#endif /* CONFIG_PREADV */
>
> static int
> qio_channel_block_set_blocking(QIOChannel *ioc,
> @@ -177,6 +221,10 @@ qio_channel_block_class_init(ObjectClass *klass,
> ioc_klass->io_writev = qio_channel_block_writev;
> ioc_klass->io_readv = qio_channel_block_readv;
> ioc_klass->io_set_blocking = qio_channel_block_set_blocking;
> +#ifdef CONFIG_PREADV
> + ioc_klass->io_preadv = qio_channel_block_preadv;
> + ioc_klass->io_pwritev = qio_channel_block_pwritev;
> +#endif
> ioc_klass->io_seek = qio_channel_block_seek;
> ioc_klass->io_close = qio_channel_block_close;
> ioc_klass->io_set_aio_fd_handler = qio_channel_block_set_aio_fd_handler;
> --
> 2.48.1
>
Best regards,
Juraj Marcin
* Re: [PATCH 2/3] migration/ram: fix docs of ram_handle_zero
From: Juraj Marcin @ 2025-10-02 14:48 UTC
To: Marco Cavenati; +Cc: qemu-devel, peterx, farosas, ppandit, berrange
On 2025-10-01 18:18, Marco Cavenati wrote:
> Remove outdated 'ch' parameter from the function documentation.
>
> Signed-off-by: Marco Cavenati <Marco.Cavenati@eurecom.fr>
> ---
> migration/ram.c | 1 -
> 1 file changed, 1 deletion(-)
Reviewed-by: Juraj Marcin <jmarcin@redhat.com>
>
> diff --git a/migration/ram.c b/migration/ram.c
> index 7208bc114f..e238c9233f 100644
> --- a/migration/ram.c
> +++ b/migration/ram.c
> @@ -3552,7 +3552,6 @@ static inline void *colo_cache_from_block_offset(RAMBlock *block,
> * determined to be zero, then zap it.
> *
> * @host: host address for the zero page
> - * @ch: what the page is filled from. We only support zero
> * @size: size of the zero page
> */
> void ram_handle_zero(void *host, uint64_t size)
> --
> 2.48.1
>
>
* Re: [PATCH 1/3] migration: add FEATURE_SEEKABLE to QIOChannelBlock
From: Marco Cavenati @ 2025-10-03 9:47 UTC
To: Juraj Marcin; +Cc: qemu-devel, peterx, farosas, ppandit, berrange
Hi Juraj,
On Thursday, October 02, 2025 16:45 CEST, Juraj Marcin <jmarcin@redhat.com> wrote:
> Hi Marco,
>
> On 2025-10-01 18:18, Marco Cavenati wrote:
> > Enable the use of the mapped-ram migration feature with savevm/loadvm
> > snapshots by adding the QIO_CHANNEL_FEATURE_SEEKABLE feature to
> > QIOChannelBlock. Implement io_preadv and io_pwritev methods to provide
> > positioned I/O capabilities that don't modify the channel's position
> > pointer.
> >
> > Signed-off-by: Marco Cavenati <Marco.Cavenati@eurecom.fr>
> > ---
> > migration/channel-block.c | 48 +++++++++++++++++++++++++++++++++++++++
> > 1 file changed, 48 insertions(+)
> >
> > diff --git a/migration/channel-block.c b/migration/channel-block.c
> > index 97de5a691b..30dcefcd8e 100644
> > --- a/migration/channel-block.c
> > +++ b/migration/channel-block.c
> > @@ -30,6 +30,7 @@ qio_channel_block_new(BlockDriverState *bs)
> > QIOChannelBlock *ioc;
> >
> > ioc = QIO_CHANNEL_BLOCK(object_new(TYPE_QIO_CHANNEL_BLOCK));
> > + qio_channel_set_feature(QIO_CHANNEL(ioc), QIO_CHANNEL_FEATURE_SEEKABLE);
> >
> > bdrv_ref(bs);
> > ioc->bs = bs;
> > @@ -96,6 +97,49 @@ qio_channel_block_writev(QIOChannel *ioc,
> > return qiov.size;
> > }
> >
> > +#ifdef CONFIG_PREADV
>
> I don't think this conditional macro is necessary here. QIOChannelFile
> needs it because it directly uses the preadv() syscall, which might not be
> available on all systems (see where CONFIG_PREADV is defined below).
> config_host_data.set('CONFIG_PREADV', cc.has_function('preadv', prefix: '#include <sys/uio.h>'))
> However, bdrv_readv_vmstate() (a wrapper around bdrv_co_readv_vmstate())
> should always be available IIUC (it is not conditionally compiled based on
> whether the preadv() syscall is available).
You are right, thanks for pointing this out; I will remove the macro in the
next iteration.
Just to make sure nothing weird happens without preadv support and without the
macro guard, I compiled QEMU with both of them absent, and indeed snapshots are
not supported at all without preadv: savevm/loadvm fail gracefully, regardless
of whether mapped-ram is on or off.
Best regards,
Marco
* Re: [PATCH 3/3] migration: mapped-ram: handle zero pages
From: Peter Xu @ 2025-10-07 21:11 UTC
To: Marco Cavenati; +Cc: qemu-devel, farosas, ppandit, berrange
On Thu, Oct 02, 2025 at 10:49:45AM +0200, Marco Cavenati wrote:
> Please note that there are a couple of errors (swapped parameters), which are
> detailed below.
> I will address these in the next iteration, along with any additional changes
> based on your feedback.
>
> Thank you
> Marco
>
> On Wednesday, October 01, 2025 18:18 CEST, Marco Cavenati <Marco.Cavenati@eurecom.fr> wrote:
>
> > Make mapped-ram compatible with loadvm snapshot restoring by explicitly
> > zeroing memory pages in this case.
> > Skip zeroing for -incoming and -loadvm migrations to preserve performance.
> >
> > Signed-off-by: Marco Cavenati <Marco.Cavenati@eurecom.fr>
> > ---
> > migration/ram.c | 56 ++++++++++++++++++++++++++++++++++++++++++++++++-
> > 1 file changed, 55 insertions(+), 1 deletion(-)
> >
> > diff --git a/migration/ram.c b/migration/ram.c
> > index e238c9233f..597d5ffe9e 100644
> > --- a/migration/ram.c
> > +++ b/migration/ram.c
> > @@ -3958,12 +3958,55 @@ static size_t ram_load_multifd_pages(void *host_addr, size_t size,
> > return size;
> > }
> >
> > +/**
> > + * handle_zero_mapped_ram: Zero out a range of RAM pages if required during
> > + * mapped-ram load
> > + *
> > + * Zeroing is only performed when restoring from a snapshot (HMP loadvm).
> > + * During incoming migration or -loadvm cli snapshot load, the function is a
> > + * no-op and returns true as in those cases the pages are already guaranteed to
> > + * be zeroed.
> > + *
> > + * Returns: true on success, false on error (with @errp set).
> > + * @from_bit_idx: Starting index relative to the map of the page (inclusive)
> > + * @to_bit_idx: Ending index relative to the map of the page (exclusive)
> > + */
> > +static bool handle_zero_mapped_ram(RAMBlock *block, unsigned long from_bit_idx,
> > + unsigned long to_bit_idx, Error **errp)
> > +{
> > + ERRP_GUARD();
> > + ram_addr_t offset;
> > + size_t size;
> > + void *host;
> > +
> > + if (runstate_check(RUN_STATE_INMIGRATE) ||
> > + runstate_check(RUN_STATE_PRELAUNCH)) {
Should we check RUN_STATE_RESTORE_VM directly here?
I think it's still good to spell out the rest, we could put it in a
comment, e.g.:
/*
* Zeroing is not needed for either -loadvm (RUN_STATE_PRELAUNCH), or
* -incoming (RUN_STATE_INMIGRATE).
*/
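i.e., roughly (a sketch combining the two suggestions above, assuming
RUN_STATE_RESTORE_VM is the state the HMP loadvm path runs in):

    if (!runstate_check(RUN_STATE_RESTORE_VM)) {
        /*
         * Zeroing is not needed for either -loadvm (RUN_STATE_PRELAUNCH), or
         * -incoming (RUN_STATE_INMIGRATE): pages are already guaranteed to
         * be zero there.
         */
        return true;
    }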
> > + return true;
> > + }
> > +
> > + if (from_bit_idx == to_bit_idx) {
Might be safer to check >= rather than ==.
> > + return true;
> > + }
> > +
> > + size = TARGET_PAGE_SIZE * (to_bit_idx - from_bit_idx);
> > + offset = from_bit_idx << TARGET_PAGE_BITS;
> > + host = host_from_ram_block_offset(block, offset);
> > + if (!host) {
> > + error_setg(errp, "zero page outside of ramblock %s range",
> > + block->idstr);
> > + return false;
> > + }
> > + ram_handle_zero(host, size);
> > +
> > + return true;
> > +}
> > +
> > static bool read_ramblock_mapped_ram(QEMUFile *f, RAMBlock *block,
> > long num_pages, unsigned long *bitmap,
> > Error **errp)
> > {
> > ERRP_GUARD();
> > - unsigned long set_bit_idx, clear_bit_idx;
> > + unsigned long set_bit_idx, clear_bit_idx = 0;
> > ram_addr_t offset;
> > void *host;
> > size_t read, unread, size;
> > @@ -3972,6 +4015,12 @@ static bool read_ramblock_mapped_ram(QEMUFile *f, RAMBlock *block,
> > set_bit_idx < num_pages;
> > set_bit_idx = find_next_bit(bitmap, num_pages, clear_bit_idx + 1)) {
> >
> > + /* Zero pages */
> > + if (!handle_zero_mapped_ram(block, set_bit_idx, clear_bit_idx, errp)) {
>
> This should be
> + if (!handle_zero_mapped_ram(block, clear_bit_idx, set_bit_idx, errp)) {
>
> > + return false;
> > + }
> > +
> > + /* Non-zero pages */
> > clear_bit_idx = find_next_zero_bit(bitmap, num_pages, set_bit_idx + 1);
> >
> > unread = TARGET_PAGE_SIZE * (clear_bit_idx - set_bit_idx);
> > @@ -4003,6 +4052,11 @@ static bool read_ramblock_mapped_ram(QEMUFile *f, RAMBlock *block,
> > }
> > }
> >
> > + /* Handle trailing 0 pages */
> > + if (!handle_zero_mapped_ram(block, num_pages, clear_bit_idx, errp)) {
>
> This should be
> + if (!handle_zero_mapped_ram(block, clear_bit_idx, num_pages, errp)) {
The rest looks all good.
I can queue patch 2 now, which is trivial. Please repost patch 1+3 after
rebasing to Fabiano's patch here:
https://lore.kernel.org/r/20251007184213.5990-1-farosas@suse.de
Then in patch 3 you can remove the MAPPED_RAM cap in the list.
Fabiano may also post some test patches he has for snapshots. You can
either respin before that, or wait for them (in which case you can also add a
mapped-ram test for snapshots).
Thanks,
--
Peter Xu
* Re: [PATCH 3/3] migration: mapped-ram: handle zero pages
From: Marco Cavenati @ 2025-10-08 9:36 UTC
To: Peter Xu; +Cc: qemu-devel, farosas, ppandit, berrange
Hello Peter,
On Tuesday, October 07, 2025 23:11 CEST, Peter Xu <peterx@redhat.com> wrote:
> On Thu, Oct 02, 2025 at 10:49:45AM +0200, Marco Cavenati wrote:
> > Please note that there are a couple of errors (swapped parameters), which are
> > detailed below.
> > I will address these in the next iteration, along with any additional changes
> > based on your feedback.
> >
> > Thank you
> > Marco
> >
> > On Wednesday, October 01, 2025 18:18 CEST, Marco Cavenati <Marco.Cavenati@eurecom.fr> wrote:
> >
> > > Make mapped-ram compatible with loadvm snapshot restoring by explicitly
> > > zeroing memory pages in this case.
> > > Skip zeroing for -incoming and -loadvm migrations to preserve performance.
> > >
> > > Signed-off-by: Marco Cavenati <Marco.Cavenati@eurecom.fr>
> > > ---
> > > migration/ram.c | 56 ++++++++++++++++++++++++++++++++++++++++++++++++-
> > > 1 file changed, 55 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/migration/ram.c b/migration/ram.c
> > > index e238c9233f..597d5ffe9e 100644
> > > --- a/migration/ram.c
> > > +++ b/migration/ram.c
> > > @@ -3958,12 +3958,55 @@ static size_t ram_load_multifd_pages(void *host_addr, size_t size,
> > > return size;
> > > }
> > >
> > > +/**
> > > + * handle_zero_mapped_ram: Zero out a range of RAM pages if required during
> > > + * mapped-ram load
> > > + *
> > > + * Zeroing is only performed when restoring from a snapshot (HMP loadvm).
> > > + * During incoming migration or -loadvm cli snapshot load, the function is a
> > > + * no-op and returns true as in those cases the pages are already guaranteed to
> > > + * be zeroed.
> > > + *
> > > + * Returns: true on success, false on error (with @errp set).
> > > + * @from_bit_idx: Starting index relative to the map of the page (inclusive)
> > > + * @to_bit_idx: Ending index relative to the map of the page (exclusive)
> > > + */
> > > +static bool handle_zero_mapped_ram(RAMBlock *block, unsigned long from_bit_idx,
> > > + unsigned long to_bit_idx, Error **errp)
> > > +{
> > > + ERRP_GUARD();
> > > + ram_addr_t offset;
> > > + size_t size;
> > > + void *host;
> > > +
> > > + if (runstate_check(RUN_STATE_INMIGRATE) ||
> > > + runstate_check(RUN_STATE_PRELAUNCH)) {
>
> Should we check RUN_STATE_RESTORE_VM directly here?
>
> I think it's still good to spell out the rest, we could put it in a
> comment, e.g.:
>
> /*
> * Zeroing is not needed for either -loadvm (RUN_STATE_PRELAUNCH), or
> * -incoming (RUN_STATE_INMIGRATE).
> */
I have no strong opinion here, but my reasoning was that the "optimization
allow-list" approach might be slightly safer.
If, due to a hypothetical bug, we end up in an optimizable scenario (zeroing
not required) while RUN_STATE is not in the allow-list, migration will only be
slower (I measured a ~25% slowdown a while back in an arguably worst-case
scenario).
On the other hand, with the RESTORE_VM-only check, if RAM actually needs to be
zeroed but RUN_STATE is not RESTORE_VM, then RAM won't be consistent and the
VM could crash.
That said, since zeroing is the "new feature", it also makes sense to adopt the
opposite approach and enable it only for RESTORE_VM.
I will do as you proposed.
> > > + return true;
> > > + }
> > > +
> > > + if (from_bit_idx == to_bit_idx) {
>
> Might be safer to check >= rather than ==.
Fixed, thanks.
> > > + return true;
> > > + }
> > > +
> > > + size = TARGET_PAGE_SIZE * (to_bit_idx - from_bit_idx);
> > > + offset = from_bit_idx << TARGET_PAGE_BITS;
> > > + host = host_from_ram_block_offset(block, offset);
> > > + if (!host) {
> > > + error_setg(errp, "zero page outside of ramblock %s range",
> > > + block->idstr);
> > > + return false;
> > > + }
> > > + ram_handle_zero(host, size);
> > > +
> > > + return true;
> > > +}
> > > +
> > > static bool read_ramblock_mapped_ram(QEMUFile *f, RAMBlock *block,
> > > long num_pages, unsigned long *bitmap,
> > > Error **errp)
> > > {
> > > ERRP_GUARD();
> > > - unsigned long set_bit_idx, clear_bit_idx;
> > > + unsigned long set_bit_idx, clear_bit_idx = 0;
> > > ram_addr_t offset;
> > > void *host;
> > > size_t read, unread, size;
> > > @@ -3972,6 +4015,12 @@ static bool read_ramblock_mapped_ram(QEMUFile *f, RAMBlock *block,
> > > set_bit_idx < num_pages;
> > > set_bit_idx = find_next_bit(bitmap, num_pages, clear_bit_idx + 1)) {
> > >
> > > + /* Zero pages */
> > > + if (!handle_zero_mapped_ram(block, set_bit_idx, clear_bit_idx, errp)) {
> >
> > This should be
> > + if (!handle_zero_mapped_ram(block, clear_bit_idx, set_bit_idx, errp)) {
> >
> > > + return false;
> > > + }
> > > +
> > > + /* Non-zero pages */
> > > clear_bit_idx = find_next_zero_bit(bitmap, num_pages, set_bit_idx + 1);
> > >
> > > unread = TARGET_PAGE_SIZE * (clear_bit_idx - set_bit_idx);
> > > @@ -4003,6 +4052,11 @@ static bool read_ramblock_mapped_ram(QEMUFile *f, RAMBlock *block,
> > > }
> > > }
> > >
> > > + /* Handle trailing 0 pages */
> > > + if (!handle_zero_mapped_ram(block, num_pages, clear_bit_idx, errp)) {
> >
> > This should be
> > + if (!handle_zero_mapped_ram(block, clear_bit_idx, num_pages, errp)) {
>
> The rest looks all good.
>
> I can queue patch 2 now, which is trivial. Please repost patch 1+3 after
> rebasing to Fabiano's patch here:
>
> https://lore.kernel.org/r/20251007184213.5990-1-farosas@suse.de
>
> Then in patch 3 you can remove the MAPPED_RAM cap in the list.
>
> Fabiano may also post some test patches he has for snapshots. You can
> either respin before that, or wait for them (in which case you can also add a
> mapped-ram test for snapshots).
Ok, thanks!
Marco