All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] memory/ramblock: Fix clear of mru_block on possible race condition
@ 2026-06-01 20:59 Peter Xu
  2026-06-02  4:25 ` Akihiko Odaki
  0 siblings, 1 reply; 7+ messages in thread
From: Peter Xu @ 2026-06-01 20:59 UTC (permalink / raw)
  To: qemu-devel
  Cc: peterx, Philippe Mathieu-Daudé, Fabiano Rosas, Paolo Bonzini,
	Alex Bennée, Akihiko Odaki

The race condition was not reported in any real bug, but I found it while
reviewing some relevant chnages from Akihiko [1].  It's not clear if
there's any way to reproduce it, so far it's only theoretical.  Hence
there's also no Fixes tag attached.  There's also no need to copy stable
until we have a solid reproducer.

Currently, mru_block might still points to ramblocks that are removed if
below race condition happens:

  Reader A                              Writer
  --------                              ------
  rcu_read_lock()
  walk list, find block X
                                        QLIST_REMOVE_RCU(X)
                                        qatomic_set(&mru_block, NULL)
                                        call_rcu(X, reclaim_ramblock)
  qatomic_set(&mru_block, X)  <----------- overwrites NULL
  rcu_read_unlock()
                                        grace period ends, X freed

  Reader C
  --------
  rcu_read_lock()
  qatomic_rcu_read(&mru_block) -> X
  Read X's block->offset ...  <----------- UAF

To fix it, we can introduce a nested RCU free for reset of the mru_block
field, and only free the ramblock until the 2nd RCU call.

Since QEMU always: (1) dequeue the RCU node before invoking the
function, (2) reset all fields in rcu_head in the call_rcu1() call, nested
RCU will work all fine like what's used in this patch.

[1] https://lore.kernel.org/r/5fd9e540-cf13-45f5-ba9a-c7faf364035b@rsg.ci.i.u-tokyo.ac.jp

Cc: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
Signed-off-by: Peter Xu <peterx@redhat.com>
---

Based-on: <5fd9e540-cf13-45f5-ba9a-c7faf364035b@rsg.ci.i.u-tokyo.ac.jp>

Since there's no reproducer, I didn't verify the problem as-is.  What I did
is I ran this through some high concurrency "make check -jN" (8 processes
with 4 cores pinned, hence 8*N make-check threads running concurrently on 4
physical cores), I didn't see any regression.
---
 system/physmem.c | 18 +++++++++++++++++-
 1 file changed, 17 insertions(+), 1 deletion(-)

diff --git a/system/physmem.c b/system/physmem.c
index 9e1ac13e82..1697568fa7 100644
--- a/system/physmem.c
+++ b/system/physmem.c
@@ -2590,6 +2590,21 @@ static void reclaim_ramblock(RAMBlock *block)
     g_free(block);
 }
 
+static void reclaim_ramblock_prepare(RAMBlock *block)
+{
+    /*
+     * 1st round rcu free guarantees the last reader that can access this
+     * ramblock is gone.  Reset this field making sure it will never point
+     * to the ramblock being freed.
+     */
+    qatomic_set(&ram_list.mru_block, NULL);
+    /*
+     * Note: this is an intended nested use of rcu_head.  If needed, we can
+     * provide two rcu_heads for ramblock.
+     */
+    call_rcu(block, reclaim_ramblock, rcu);
+}
+
 void qemu_ram_free(RAMBlock *block)
 {
     g_autofree char *name = NULL;
@@ -2607,10 +2622,11 @@ void qemu_ram_free(RAMBlock *block)
     name = cpr_name(block->mr);
     cpr_delete_fd(name, 0);
     QLIST_REMOVE_RCU(block, next);
+    /* First attempt of reset, see reclaim_ramblock_prepare() for the 2nd */
     qatomic_set(&ram_list.mru_block, NULL);
     /* Write list before version */
     qatomic_store_release(&ram_list.version, ram_list.version + 1);
-    call_rcu(block, reclaim_ramblock, rcu);
+    call_rcu(block, reclaim_ramblock_prepare, rcu);
     qemu_mutex_unlock_ramlist();
 }
 
-- 
2.53.0



^ permalink raw reply related	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2026-06-04 13:34 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-06-01 20:59 [PATCH] memory/ramblock: Fix clear of mru_block on possible race condition Peter Xu
2026-06-02  4:25 ` Akihiko Odaki
2026-06-02 19:51   ` Peter Xu
2026-06-03  5:11     ` Akihiko Odaki
2026-06-03 17:35       ` Peter Xu
2026-06-04  5:24         ` Akihiko Odaki
2026-06-04 13:33           ` Peter Xu

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.