All of lore.kernel.org
 help / color / mirror / Atom feed
From: Peter Xu <peterx@redhat.com>
To: qemu-devel@nongnu.org
Cc: "Peter Xu" <peterx@redhat.com>, "Fabiano Rosas" <farosas@suse.de>,
	"Paolo Bonzini" <pbonzini@redhat.com>,
	"Akihiko Odaki" <odaki@rsg.ci.i.u-tokyo.ac.jp>,
	"Philippe Mathieu-Daudé" <philmd@linaro.org>
Subject: [PULL 03/18] system/physmem: Synchronize ram_list accesses
Date: Tue, 23 Jun 2026 08:47:44 -0400	[thread overview]
Message-ID: <20260623124759.125399-4-peterx@redhat.com> (raw)
In-Reply-To: <20260623124759.125399-1-peterx@redhat.com>

From: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>

Alex Bennée reported a ThreadSanitizer warning about a plain concurrent
access to ram_list [1]. Ensure the concurrent accesses to ram_list are
properly synchronized with atomic accesses, mutexes, or RCU.

First, the plain assignments of ram_list.mru_block are replaced with
qatomic_set(). A comment in qemu_get_ram_block() explains why the
ordering requirement is relaxed, but it still needs to be atomically
accessed. include/qemu/atomic.h says:

> The C11 memory model says that variables that are accessed from
> different threads should at least be done with __ATOMIC_RELAXED
> primitives or the result is undefined. Generally this has little to
> no effect on the generated code but not using the atomic primitives
> will get flagged by sanitizers as a violation.

Second, ram_list.version accesses are replaced with atomic operations
or protected with a mutex. Unlike ram_list.mru_block, ram_list.version
has tighter ordering requirements for one of its goals: ensuring that
the reader-held rs->last_seen_block value is invalidated whenever a RAM
block is reclaimed between two RCU reader critical sections. Below are
steps a reader and an updater follow:

Reader:

R-1. Enter the first RCU read-side critical section:
R-1-1. rs->last_version = qatomic_load_acquire(&ram_list.version)
R-1-2. rs->last_seen_block = an element of ram_list.blocks
R-2. Enter the second RCU read-side critical section:
R-2-1. if (qatomic_read(&ram_list.version) != rs->last_version)
R-2-2.     rs->last_seen_block = NULL

Updater:

W-1. Enter a ram_list.mutex critical section
W-1-1. Update ram_list.blocks
W-1-2. qatomic_store_release(&ram_list.version, ram_list.version + 1)
W-2. Enter another ram_list.mutex critical section
W-2-1. QLIST_REMOVE_RCU(block, next)
W-2-2. qatomic_store_release(&ram_list.version, ram_list.version + 1)
W-2-3. call_rcu(block, reclaim_ramblock, rcu)

W-1-2 represents the write observed by R-1-1.

ram_list.version is read non-atomically on the update side because the
update side is serialized with ram_list.mutex. The other ram_list
accesses in these steps are reasoned about in two cases.

When the grace period of W-2-3 contains R-2:
    qatomic_load_acquire() at R-1-1 and qatomic_store_release() at W-1-2
    enforce the following ordering:
        W-1-1 -> W-1-2 -> R-1-1 -> R-1-2
    The value of ram_list.blocks stored by W-1-1 or a newer value that
    was loaded by R-1-2 is still valid because of the grace period.

When the grace period of W-2-3 ends before R-2:
    call_rcu() at W-2-3 and the read-side critical section at R-2 ensure
    the following ordering:
        W-2-2 -> W-2-3 -> the grace period -> R-2 -> R-2-1
    The value of ram_list.version stored by W-2-2 or a newer value that
    was loaded by R-2-1 differs from rs->last_version and the reader
    invalidates rs->last_seen_block.

Together, these steps ensure that rs->last_seen_block is invalidated
whenever necessary. With added atomic operations, pre-existing memory
barriers are no longer necessary and are removed.

Any other ram_list accesses are already properly synchronized.

[1] https://lore.kernel.org/qemu-devel/878q9fbmap.fsf@draig.linaro.org/

Signed-off-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Link: https://lore.kernel.org/r/20260523-tsan-v1-1-07d5eb9dcaa2@rsg.ci.i.u-tokyo.ac.jp
Signed-off-by: Peter Xu <peterx@redhat.com>
---
 migration/ram.c  | 10 +++++-----
 system/physmem.c | 16 +++++++---------
 2 files changed, 12 insertions(+), 14 deletions(-)

diff --git a/migration/ram.c b/migration/ram.c
index fc38ffbf8a..6da24d7258 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -2495,7 +2495,10 @@ static void ram_state_reset(RAMState *rs)
 
     rs->last_seen_block = NULL;
     rs->last_page = 0;
-    rs->last_version = ram_list.version;
+
+    /* Read version before ram_list.blocks */
+    rs->last_version = qatomic_load_acquire(&ram_list.version);
+
     rs->xbzrle_started = false;
 
     ram_page_hint_reset(&rs->page_hint);
@@ -3270,13 +3273,10 @@ static int ram_save_iterate(QEMUFile *f, void *opaque)
      */
     WITH_QEMU_LOCK_GUARD(&rs->bitmap_mutex) {
         WITH_RCU_READ_LOCK_GUARD() {
-            if (ram_list.version != rs->last_version) {
+            if (qatomic_read(&ram_list.version) != rs->last_version) {
                 ram_state_reset(rs);
             }
 
-            /* Read version before ram_list.blocks */
-            smp_rmb();
-
             ret = rdma_registration_start(f, RAM_CONTROL_ROUND);
             if (ret < 0) {
                 qemu_file_set_error(f, ret);
diff --git a/system/physmem.c b/system/physmem.c
index 9e5b50c5b1..db8ad84ab6 100644
--- a/system/physmem.c
+++ b/system/physmem.c
@@ -839,12 +839,12 @@ found:
     /* It is safe to write mru_block outside the BQL.  This
      * is what happens:
      *
-     *     mru_block = xxx
+     *     qatomic_set(&mru_block, xxx)
      *     rcu_read_unlock()
      *                                        xxx removed from list
      *                  rcu_read_lock()
      *                  read mru_block
-     *                                        mru_block = NULL;
+     *                                        qatomic_set(&mru_block, NULL);
      *                                        call_rcu(reclaim_ramblock, xxx);
      *                  rcu_read_unlock()
      *
@@ -852,7 +852,7 @@ found:
      * when it was placed into the list.  Here we're just making an extra
      * copy of the pointer.
      */
-    ram_list.mru_block = block;
+    qatomic_set(&ram_list.mru_block, block);
     return block;
 }
 
@@ -2260,11 +2260,10 @@ static void ram_block_add(RAMBlock *new_block, Error **errp)
     } else { /* list is empty */
         QLIST_INSERT_HEAD_RCU(&ram_list.blocks, new_block, next);
     }
-    ram_list.mru_block = NULL;
+    qatomic_set(&ram_list.mru_block, NULL);
 
     /* Write list before version */
-    smp_wmb();
-    ram_list.version++;
+    qatomic_store_release(&ram_list.version, ram_list.version + 1);
     qemu_mutex_unlock_ramlist();
 
     physical_memory_set_dirty_range(new_block->offset,
@@ -2608,10 +2607,9 @@ void qemu_ram_free(RAMBlock *block)
     name = cpr_name(block->mr);
     cpr_delete_fd(name, 0);
     QLIST_REMOVE_RCU(block, next);
-    ram_list.mru_block = NULL;
+    qatomic_set(&ram_list.mru_block, NULL);
     /* Write list before version */
-    smp_wmb();
-    ram_list.version++;
+    qatomic_store_release(&ram_list.version, ram_list.version + 1);
     call_rcu(block, reclaim_ramblock, rcu);
     qemu_mutex_unlock_ramlist();
 }
-- 
2.54.0



  parent reply	other threads:[~2026-06-23 12:50 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-06-23 12:47 [PULL 00/18] Next patches Peter Xu
2026-06-23 12:47 ` [PULL 01/18] thread-pool: Allow at least 1 thread in thread_pool_adjust_max_threads_to_work() Peter Xu
2026-06-23 12:47 ` [PULL 02/18] qapi/migration: Remove @cpr-exec-command doc in MigrationParameter Peter Xu
2026-06-23 12:47 ` Peter Xu [this message]
2026-06-23 12:47 ` [PULL 04/18] system/memory: Remove MAX_PHYS_ADDR Peter Xu
2026-06-23 12:47 ` [PULL 05/18] migration: Use OBJECT_DECLARE_SIMPLE_TYPE Peter Xu
2026-06-23 12:47 ` [PULL 06/18] tests/qtest/migration: Add migration test on loongarch Peter Xu
2026-06-23 12:47 ` [PULL 07/18] migration/tests: Update a-b-boot images for all archs Peter Xu
2026-06-23 12:47 ` [PULL 08/18] system/memory: split RamDiscardManager into source and manager Peter Xu
2026-06-23 12:47 ` [PULL 09/18] system/memory: move RamDiscardManager to separate compilation unit Peter Xu
2026-06-23 12:47 ` [PULL 10/18] system/memory: constify section arguments Peter Xu
2026-06-23 12:47 ` [PULL 11/18] system/ram-discard-manager: implement replay via is_populated iteration Peter Xu
2026-06-23 12:47 ` [PULL 12/18] virtio-mem: remove replay_populated/replay_discarded implementation Peter Xu
2026-06-23 12:47 ` [PULL 13/18] system/ram-discard-manager: drop replay from source interface Peter Xu
2026-06-23 12:47 ` [PULL 14/18] system/memory: implement RamDiscardManager multi-source aggregation Peter Xu
2026-06-23 12:47 ` [PULL 15/18] system/physmem: destroy ram block attributes before RCU-deferred reclaim Peter Xu
2026-06-23 12:47 ` [PULL 16/18] system/memory: add RamDiscardManager reference counting and cleanup Peter Xu
2026-06-23 12:47 ` [PULL 17/18] tests: add unit tests for RamDiscardManager multi-source aggregation Peter Xu
2026-06-23 12:47 ` [PULL 18/18] system/physmem: make ram_block_discard_range() handle guest_memfd Peter Xu
2026-06-25 20:26 ` [PULL 00/18] Next patches Stefan Hajnoczi

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260623124759.125399-4-peterx@redhat.com \
    --to=peterx@redhat.com \
    --cc=farosas@suse.de \
    --cc=odaki@rsg.ci.i.u-tokyo.ac.jp \
    --cc=pbonzini@redhat.com \
    --cc=philmd@linaro.org \
    --cc=qemu-devel@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.