From: Peter Xu <peterx@redhat.com>
To: qemu-devel@nongnu.org
Cc: "Peter Xu" <peterx@redhat.com>, "Fabiano Rosas" <farosas@suse.de>,
	"Paolo Bonzini" <pbonzini@redhat.com>,
	"Marc-André Lureau" <marcandre.lureau@redhat.com>,
	"Xiaoyao Li" <xiaoyao.li@intel.com>
Subject: [PULL 18/18] system/physmem: make ram_block_discard_range() handle guest_memfd
Date: Tue, 23 Jun 2026 08:47:59 -0400
Message-ID: <20260623124759.125399-19-peterx@redhat.com>
In-Reply-To: <20260623124759.125399-1-peterx@redhat.com>

From: Marc-André Lureau <marcandre.lureau@redhat.com>

Most callers of ram_block_discard_range() want to discard both the
shared and guest_memfd backing. Only kvm_convert_memory() intentionally
discards a single plane during private/shared conversions.

Rename the current implementation to ram_block_discard_shared_range()
and make ram_block_discard_range() a composite that also discards
guest_memfd when present (rb->guest_memfd >= 0). This ensures that
callers such as virtio-mem, virtio-balloon, hv-balloon and migration
also reclaim private pages on discard.

Update kvm_convert_memory() to use the plane-specific
ram_block_discard_shared_range() since it only needs to discard
the shared backing when converting to private.

Likewise, use ram_block_discard_shared_range() in tdx_finalize_vm()
after the TDVF image copy.

Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Link: https://lore.kernel.org/r/20260604-rdm5-v5-11-5768e6a0943d@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
---
 include/system/ramblock.h |  3 ++-
 accel/kvm/kvm-all.c       |  2 +-
 system/physmem.c          | 25 +++++++++++++++++++++----
 target/i386/kvm/tdx.c     |  2 +-
 system/trace-events       |  2 +-
 5 files changed, 26 insertions(+), 8 deletions(-)
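
The control flow of the new composite can be modelled outside QEMU. The
sketch below is illustrative only and is not the QEMU implementation:
MockRAMBlock and the mock_* functions are hypothetical stand-ins for
RAMBlock and the three ram_block_discard_*() functions, and the counters
exist purely to make the call order observable. It shows that the shared
plane is discarded first, that a failure there returns early without
touching guest_memfd, and that the guest_memfd plane is only discarded
when guest_memfd >= 0.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for QEMU's RAMBlock; not the real structure. */
typedef struct MockRAMBlock {
    int guest_memfd;        /* < 0 means no guest_memfd backing */
    int shared_result;      /* value the mock shared discard returns */
    int shared_calls;       /* shared-plane discards issued so far */
    int guest_memfd_calls;  /* guest_memfd discards issued so far */
} MockRAMBlock;

/* Models ram_block_discard_shared_range(): records the call only. */
static int mock_discard_shared_range(MockRAMBlock *rb, uint64_t offset,
                                     size_t length)
{
    (void)offset;
    (void)length;
    rb->shared_calls++;
    return rb->shared_result;
}

/* Models ram_block_discard_guest_memfd_range(): always succeeds here. */
static int mock_discard_guest_memfd_range(MockRAMBlock *rb, uint64_t offset,
                                          size_t length)
{
    (void)offset;
    (void)length;
    rb->guest_memfd_calls++;
    return 0;
}

/* Same shape as the composite ram_block_discard_range() in this patch. */
static int mock_discard_range(MockRAMBlock *rb, uint64_t offset,
                              size_t length)
{
    int ret = mock_discard_shared_range(rb, offset, length);

    if (ret) {
        return ret;     /* shared discard failed: skip guest_memfd */
    }
    if (rb->guest_memfd >= 0) {
        ret = mock_discard_guest_memfd_range(rb, offset, length);
    }
    return ret;
}
```

In this model, a block with guest_memfd = -1 sees one shared discard and
no guest_memfd discard, while a block whose shared discard fails returns
that error and leaves the guest_memfd plane untouched.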

diff --git a/include/system/ramblock.h b/include/system/ramblock.h
index 2b38718fe5..f0639287bf 100644
--- a/include/system/ramblock.h
+++ b/include/system/ramblock.h
@@ -103,7 +103,8 @@ struct RamBlockAttributes {
 
 /* @offset: the offset within the RAMBlock */
 int ram_block_discard_range(RAMBlock *rb, uint64_t offset, size_t length);
-/* @offset: the offset within the RAMBlock */
+int ram_block_discard_shared_range(RAMBlock *rb, uint64_t offset,
+                                   size_t length);
 int ram_block_discard_guest_memfd_range(RAMBlock *rb, uint64_t offset,
                                         size_t length);
 
diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
index f4f0e64fbd..8ddfbaff46 100644
--- a/accel/kvm/kvm-all.c
+++ b/accel/kvm/kvm-all.c
@@ -3422,7 +3422,7 @@ int kvm_convert_memory(hwaddr start, hwaddr size, bool to_private)
              */
             goto out_unref;
         }
-        ret = ram_block_discard_range(rb, offset, size);
+        ret = ram_block_discard_shared_range(rb, offset, size);
     } else {
         ret = ram_block_discard_guest_memfd_range(rb, offset, size);
     }
diff --git a/system/physmem.c b/system/physmem.c
index c4bfb57625..c21ea92915 100644
--- a/system/physmem.c
+++ b/system/physmem.c
@@ -4093,7 +4093,7 @@ int qemu_ram_foreach_block(RAMBlockIterFunc func, void *opaque)
  * Returns: 0 on success, none-0 on failure
  *
  */
-int ram_block_discard_range(RAMBlock *rb, uint64_t offset, size_t length)
+int ram_block_discard_shared_range(RAMBlock *rb, uint64_t offset, size_t length)
 {
     int ret = -1;
 
@@ -4142,7 +4142,7 @@ int ram_block_discard_range(RAMBlock *rb, uint64_t offset, size_t length)
              * have a MAP_PRIVATE mapping, possibly messing with other
              * MAP_PRIVATE/MAP_SHARED mappings. There is no easy way to
              * change that behavior whithout violating the promised
-             * semantics of ram_block_discard_range().
+             * semantics of ram_block_discard_shared_range().
              *
              * Only warn, because it works as long as nobody else uses that
              * file.
@@ -4198,8 +4198,9 @@ int ram_block_discard_range(RAMBlock *rb, uint64_t offset, size_t length)
             goto err;
 #endif
         }
-        trace_ram_block_discard_range(rb->idstr, host_startaddr, length,
-                                      need_madvise, need_fallocate, ret);
+        trace_ram_block_discard_shared_range(rb->idstr, host_startaddr, length,
+                                             need_madvise, need_fallocate,
+                                             ret);
     } else {
         error_report("%s: Overrun block '%s' (%" PRIu64 "/%zx/" RAM_ADDR_FMT")",
                      __func__, rb->idstr, offset, length, rb->max_length);
@@ -4209,6 +4210,22 @@ err:
     return ret;
 }
 
+int ram_block_discard_range(RAMBlock *rb, uint64_t offset, size_t length)
+{
+    int ret;
+
+    ret = ram_block_discard_shared_range(rb, offset, length);
+    if (ret) {
+        return ret;
+    }
+
+    if (rb->guest_memfd >= 0) {
+        ret = ram_block_discard_guest_memfd_range(rb, offset, length);
+    }
+
+    return ret;
+}
+
 int ram_block_discard_guest_memfd_range(RAMBlock *rb, uint64_t offset,
                                         size_t length)
 {
diff --git a/target/i386/kvm/tdx.c b/target/i386/kvm/tdx.c
index df46fce769..dfad469112 100644
--- a/target/i386/kvm/tdx.c
+++ b/target/i386/kvm/tdx.c
@@ -385,7 +385,7 @@ static void tdx_finalize_vm(Notifier *notifier, void *unused)
      * KVM_MEMORY_MAPPING. It becomes useless.
      */
     ram_block = tdx_guest->tdvf_mr->ram_block;
-    ram_block_discard_range(ram_block, 0, ram_block->max_length);
+    ram_block_discard_shared_range(ram_block, 0, ram_block->max_length);
 
     tdx_vm_ioctl(KVM_TDX_FINALIZE_VM, 0, NULL, &error_fatal);
     CONFIDENTIAL_GUEST_SUPPORT(tdx_guest)->ready = true;
diff --git a/system/trace-events b/system/trace-events
index e6e1b61279..51b4a4679a 100644
--- a/system/trace-events
+++ b/system/trace-events
@@ -32,7 +32,7 @@ global_dirty_changed(unsigned int bitmask) "bitmask 0x%"PRIx32
 address_space_map(void *as, uint64_t addr, uint64_t len, bool is_write, uint32_t attrs) "as:%p addr 0x%"PRIx64":%"PRIx64" write:%d attrs:0x%x"
 find_ram_offset(uint64_t size, uint64_t offset) "size: 0x%" PRIx64 " @ 0x%" PRIx64
 find_ram_offset_loop(uint64_t size, uint64_t candidate, uint64_t offset, uint64_t next, uint64_t mingap) "trying size: 0x%" PRIx64 " @ 0x%" PRIx64 ", offset: 0x%" PRIx64" next: 0x%" PRIx64 " mingap: 0x%" PRIx64
-ram_block_discard_range(const char *rbname, void *hva, size_t length, bool need_madvise, bool need_fallocate, int ret) "%s@%p + 0x%zx: madvise: %d fallocate: %d ret: %d"
+ram_block_discard_shared_range(const char *rbname, void *hva, size_t length, bool need_madvise, bool need_fallocate, int ret) "%s@%p + 0x%zx: madvise: %d fallocate: %d ret: %d"
 qemu_ram_alloc_shared(const char *name, size_t size, size_t max_size, int fd, void *host) "%s size %zu max_size %zu fd %d host %p"
 
 subpage_register(void *subpage, uint32_t start, uint32_t end, int idx, int eidx, uint16_t section) "subpage %p start 0x%08x end 0x%08x idx 0x%08x eidx 0x%08x section %u"
-- 
2.54.0


