All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] system/physmem: Synchronize ram_list accesses
@ 2026-05-23 12:38 Akihiko Odaki
  2026-05-26  9:22 ` Philippe Mathieu-Daudé
  2026-05-26 21:35 ` Peter Xu
  0 siblings, 2 replies; 13+ messages in thread
From: Akihiko Odaki @ 2026-05-23 12:38 UTC (permalink / raw)
  To: qemu-devel
  Cc: Alex Bennée, Peter Xu, Fabiano Rosas, Paolo Bonzini,
	Philippe Mathieu-Daudé, Akihiko Odaki

Alex Bennée reported a ThreadSanitizer warning about a plain concurrent
access to ram_list [1]. Ensure the concurrent accesses to ram_list are
properly synchronized with atomic accesses, mutexes, or RCU.

First, the plain assignments of ram_list.mru_block are replaced with
qatomic_set(). A comment in qemu_get_ram_block() explains why the
ordering requirement is relaxed, but it still needs to be atomically
accessed. include/qemu/atomic.h says:

> The C11 memory model says that variables that are accessed from
> different threads should at least be done with __ATOMIC_RELAXED
> primitives or the result is undefined. Generally this has little to
> no effect on the generated code but not using the atomic primitives
> will get flagged by sanitizers as a violation.

Second, ram_list.version accesses are replaced with atomic operations
or protected with a mutex. Unlike ram_list.mru_block, ram_list.version
has tighter ordering requirements for one of its goals: ensuring that
the reader-held rs->last_seen_block value is invalidated whenever a RAM
block is reclaimed between two RCU reader critical sections. Below are
steps a reader and an updater follow:

Reader:

R-1. Enter the first RCU read-side critical section:
R-1-1. rs->last_version = qatomic_load_acquire(&ram_list.version)
R-1-2. rs->last_seen_block = an element of ram_list.blocks
R-2. Enter the second RCU read-side critical section:
R-2-1. if (qatomic_read(&ram_list.version) != rs->last_version)
R-2-2.     rs->last_seen_block = NULL

Updater:

W-1. Enter a ram_list.mutex critical section
W-1-1. Update ram_list.blocks
W-1-2. qatomic_store_release(&ram_list.version, ram_list.version + 1)
W-2. Enter another ram_list.mutex critical section
W-2-1. QLIST_REMOVE_RCU(block, next)
W-2-2. qatomic_store_release(&ram_list.version, ram_list.version + 1)
W-2-3. call_rcu(block, reclaim_ramblock, rcu)

W-1-2 represents the write observed by R-1-1.

ram_list.version is read non-atomically on the update side because the
update side is serialized with ram_list.mutex. The other ram_list
accesses in these steps are reasoned about in two cases.

When the grace period of W-2-3 contains R-2:
    qatomic_load_acquire() at R-1-1 and qatomic_store_release() at W-1-2
    enforce the following ordering:
        W-1-1 -> W-1-2 -> R-1-1 -> R-1-2
    The value of ram_list.blocks stored by W-1-1 or a newer value that
    was loaded by R-1-2 is still valid because of the grace period.

When the grace period of W-2-3 ends before R-2:
    call_rcu() at W-2-3 and the read-side critical section at R-2 ensure
    the following ordering:
        W-2-2 -> W-2-3 -> the grace period -> R-2 -> R-2-1
    The value of ram_list.version stored by W-2-2 or a newer value that
    was loaded by R-2-1 differs from rs->last_version and the reader
    invalidates rs->last_seen_block.

Together, these steps ensure that rs->last_seen_block is invalidated
whenever necessary. With added atomic operations, pre-existing memory
barriers are no longer necessary and are removed.

Any other ram_list accesses are already properly synchronized.

[1] https://lore.kernel.org/qemu-devel/878q9fbmap.fsf@draig.linaro.org/

Signed-off-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
---
 migration/ram.c  | 10 +++++-----
 system/physmem.c | 16 +++++++---------
 2 files changed, 12 insertions(+), 14 deletions(-)

diff --git a/migration/ram.c b/migration/ram.c
index fc38ffbf8af1..6da24d725861 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -2495,7 +2495,10 @@ static void ram_state_reset(RAMState *rs)
 
     rs->last_seen_block = NULL;
     rs->last_page = 0;
-    rs->last_version = ram_list.version;
+
+    /* Read version before ram_list.blocks */
+    rs->last_version = qatomic_load_acquire(&ram_list.version);
+
     rs->xbzrle_started = false;
 
     ram_page_hint_reset(&rs->page_hint);
@@ -3270,13 +3273,10 @@ static int ram_save_iterate(QEMUFile *f, void *opaque)
      */
     WITH_QEMU_LOCK_GUARD(&rs->bitmap_mutex) {
         WITH_RCU_READ_LOCK_GUARD() {
-            if (ram_list.version != rs->last_version) {
+            if (qatomic_read(&ram_list.version) != rs->last_version) {
                 ram_state_reset(rs);
             }
 
-            /* Read version before ram_list.blocks */
-            smp_rmb();
-
             ret = rdma_registration_start(f, RAM_CONTROL_ROUND);
             if (ret < 0) {
                 qemu_file_set_error(f, ret);
diff --git a/system/physmem.c b/system/physmem.c
index 7bcbf8757361..9e1ac13e8259 100644
--- a/system/physmem.c
+++ b/system/physmem.c
@@ -839,12 +839,12 @@ found:
     /* It is safe to write mru_block outside the BQL.  This
      * is what happens:
      *
-     *     mru_block = xxx
+     *     qatomic_set(&mru_block, xxx)
      *     rcu_read_unlock()
      *                                        xxx removed from list
      *                  rcu_read_lock()
      *                  read mru_block
-     *                                        mru_block = NULL;
+     *                                        qatomic_set(&mru_block, NULL);
      *                                        call_rcu(reclaim_ramblock, xxx);
      *                  rcu_read_unlock()
      *
@@ -852,7 +852,7 @@ found:
      * when it was placed into the list.  Here we're just making an extra
      * copy of the pointer.
      */
-    ram_list.mru_block = block;
+    qatomic_set(&ram_list.mru_block, block);
     return block;
 }
 
@@ -2260,11 +2260,10 @@ static void ram_block_add(RAMBlock *new_block, Error **errp)
     } else { /* list is empty */
         QLIST_INSERT_HEAD_RCU(&ram_list.blocks, new_block, next);
     }
-    ram_list.mru_block = NULL;
+    qatomic_set(&ram_list.mru_block, NULL);
 
     /* Write list before version */
-    smp_wmb();
-    ram_list.version++;
+    qatomic_store_release(&ram_list.version, ram_list.version + 1);
     qemu_mutex_unlock_ramlist();
 
     physical_memory_set_dirty_range(new_block->offset,
@@ -2608,10 +2607,9 @@ void qemu_ram_free(RAMBlock *block)
     name = cpr_name(block->mr);
     cpr_delete_fd(name, 0);
     QLIST_REMOVE_RCU(block, next);
-    ram_list.mru_block = NULL;
+    qatomic_set(&ram_list.mru_block, NULL);
     /* Write list before version */
-    smp_wmb();
-    ram_list.version++;
+    qatomic_store_release(&ram_list.version, ram_list.version + 1);
     call_rcu(block, reclaim_ramblock, rcu);
     qemu_mutex_unlock_ramlist();
 }

---
base-commit: e89049b3ba5f1f0468bc0d294173345597514a1b
change-id: 20260520-tsan-c6f1aa909a90

Best regards,
--  
Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>



^ permalink raw reply related	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2026-06-01 21:02 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-05-23 12:38 [PATCH] system/physmem: Synchronize ram_list accesses Akihiko Odaki
2026-05-26  9:22 ` Philippe Mathieu-Daudé
2026-05-26 11:58   ` Akihiko Odaki
2026-05-26 12:10     ` Philippe Mathieu-Daudé
2026-05-26 21:35 ` Peter Xu
2026-05-27  7:54   ` Akihiko Odaki
2026-05-28 15:18     ` Peter Xu
2026-05-28 15:32       ` Akihiko Odaki
2026-05-28 16:03         ` Peter Xu
2026-05-29  8:24           ` Akihiko Odaki
2026-05-29 16:13             ` Peter Xu
2026-05-29 16:39               ` Akihiko Odaki
2026-06-01 21:01                 ` Peter Xu

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.