[PATCH v2 0/4] rust: more memory barriers bindings

NVIDIA GPU driver infrastructure
 help / color / mirror / Atom feed

* [PATCH v2 0/4] rust: more memory barriers bindings
@ 2026-06-09 15:38 Gary Guo
  2026-06-09 15:38 ` [PATCH v2 1/4] rust: sync: add helpers for mb, dma_mb and friends Gary Guo
                   ` (3 more replies)
  0 siblings, 4 replies; 7+ messages in thread
From: Gary Guo @ 2026-06-09 15:38 UTC (permalink / raw)
  To: Miguel Ojeda, Boqun Feng, Björn Roy Baron, Benno Lossin,
	Andreas Hindborg, Alice Ryhl, Trevor Gross, Danilo Krummrich,
	Will Deacon, Peter Zijlstra, Mark Rutland, Alexandre Courbot,
	David Airlie, Simona Vetter
  Cc: Gary Guo, Alan Stern, Andrea Parri, Nicholas Piggin,
	David Howells, Jade Alglave, Luc Maranget, Paul E. McKenney,
	Akira Yokosawa, Daniel Lustig, Joel Fernandes, rust-for-linux,
	linux-kernel, nova-gpu, dri-devel, lkmm, Eliot Courtney

This expands the existing smp barriers to also mandatory barriers and DMA
barriers.

The API looks like: `mb(Ordering)`/`smp_mb(Ordering)`/`dma_mb(Ordering)`,
where `Ordering` is one of `Full`, `Read`, `Write`.

I retain the use of generics to avoid duplicating `CONFIG_SMP` check (but
also retaining the ability to easily add `Acquire` and `Release` ordering
in the future).

A user of these introduced API is included in patch 3, which is a
concurrency bug that exists in Nova code today due to missing barriers.

Signed-off-by: Gary Guo <gary@garyguo.net>
---
Changes in v2:
- Dropped `Acquire` and `Release` aliases of `Full` (Joel)
- Use macros to implement most `MemoryBarrier`
- Split Nova change to GSP->CPU commit and CPU->GSP commit (Joel)
- Link to v1: https://patch.msgid.link/20260402152443.1059634-2-gary@kernel.org

To: Miguel Ojeda <ojeda@kernel.org>
To: Boqun Feng <boqun@kernel.org>
To: Gary Guo <gary@garyguo.net>
To: Björn Roy Baron <bjorn3_gh@protonmail.com>
To: Benno Lossin <lossin@kernel.org>
To: Andreas Hindborg <a.hindborg@kernel.org>
To: Alice Ryhl <aliceryhl@google.com>
To: Trevor Gross <tmgross@umich.edu>
To: Danilo Krummrich <dakr@kernel.org>
To: Will Deacon <will@kernel.org>
To: Peter Zijlstra <peterz@infradead.org>
To: Mark Rutland <mark.rutland@arm.com>
To: Alexandre Courbot <acourbot@nvidia.com>
To: David Airlie <airlied@gmail.com>
To: Simona Vetter <simona@ffwll.ch>
Cc: Alan Stern <stern@rowland.harvard.edu>
Cc: Andrea Parri <parri.andrea@gmail.com>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Jade Alglave <j.alglave@ucl.ac.uk>
Cc: Luc Maranget <luc.maranget@inria.fr>
Cc: "Paul E. McKenney" <paulmck@kernel.org>
Cc: Akira Yokosawa <akiyks@gmail.com>
Cc: Daniel Lustig <dlustig@nvidia.com>
Cc: Joel Fernandes <joelagnelf@nvidia.com>
Cc: rust-for-linux@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: nova-gpu@lists.linux.dev
Cc: dri-devel@lists.freedesktop.org
Cc: lkmm@lists.linux.dev

---
Gary Guo (4):
      rust: sync: add helpers for mb, dma_mb and friends
      rust: sync: generic memory barriers
      gpu: nova-core: fix barrier usage in CPU->GSP messaging path
      gpu: nova-core: fix barrier usage in GSP->CPU messaging path

 drivers/gpu/nova-core/gsp/cmdq.rs   |  18 ++++++
 drivers/gpu/nova-core/gsp/fw.rs     |  12 ----
 rust/helpers/barrier.c              |  30 +++++++++
 rust/kernel/sync/atomic/ordering.rs |   2 +-
 rust/kernel/sync/barrier.rs         | 123 ++++++++++++++++++++++++++++--------
 5 files changed, 144 insertions(+), 41 deletions(-)
---
base-commit: a87737435cfa134f9cdcc696ba3080759d04cf72
change-id: 20260609-rust-barrier-63078ea76216

Best regards,
--  
Gary Guo <gary@garyguo.net>

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH v2 1/4] rust: sync: add helpers for mb, dma_mb and friends
  2026-06-09 15:38 [PATCH v2 0/4] rust: more memory barriers bindings Gary Guo
@ 2026-06-09 15:38 ` Gary Guo
  2026-07-02  4:35   ` Boqun Feng
  2026-06-09 15:38 ` [PATCH v2 2/4] rust: sync: generic memory barriers Gary Guo
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 7+ messages in thread
From: Gary Guo @ 2026-06-09 15:38 UTC (permalink / raw)
  To: Miguel Ojeda, Boqun Feng, Björn Roy Baron, Benno Lossin,
	Andreas Hindborg, Alice Ryhl, Trevor Gross, Danilo Krummrich,
	Will Deacon, Peter Zijlstra, Mark Rutland, Alexandre Courbot,
	David Airlie, Simona Vetter
  Cc: Gary Guo, Alan Stern, Andrea Parri, Nicholas Piggin,
	David Howells, Jade Alglave, Luc Maranget, Paul E. McKenney,
	Akira Yokosawa, Daniel Lustig, Joel Fernandes, rust-for-linux,
	linux-kernel, nova-gpu, dri-devel, lkmm, Eliot Courtney

They supplement the existing smp_mb, smp_rmb and smp_wmb.

Reviewed-by: Eliot Courtney <ecourtney@nvidia.com>
Signed-off-by: Gary Guo <gary@garyguo.net>
---
 rust/helpers/barrier.c | 30 ++++++++++++++++++++++++++++++
 1 file changed, 30 insertions(+)

diff --git a/rust/helpers/barrier.c b/rust/helpers/barrier.c
index fed8853745c8..dbc7a3017c78 100644
--- a/rust/helpers/barrier.c
+++ b/rust/helpers/barrier.c
@@ -2,6 +2,36 @@
 
 #include <asm/barrier.h>
 
+__rust_helper void rust_helper_mb(void)
+{
+	mb();
+}
+
+__rust_helper void rust_helper_rmb(void)
+{
+	rmb();
+}
+
+__rust_helper void rust_helper_wmb(void)
+{
+	wmb();
+}
+
+__rust_helper void rust_helper_dma_mb(void)
+{
+	dma_mb();
+}
+
+__rust_helper void rust_helper_dma_rmb(void)
+{
+	dma_rmb();
+}
+
+__rust_helper void rust_helper_dma_wmb(void)
+{
+	dma_wmb();
+}
+
 __rust_helper void rust_helper_smp_mb(void)
 {
 	smp_mb();

-- 
2.54.0


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH v2 2/4] rust: sync: generic memory barriers
  2026-06-09 15:38 [PATCH v2 0/4] rust: more memory barriers bindings Gary Guo
  2026-06-09 15:38 ` [PATCH v2 1/4] rust: sync: add helpers for mb, dma_mb and friends Gary Guo
@ 2026-06-09 15:38 ` Gary Guo
  2026-07-02  4:56   ` Boqun Feng
  2026-06-09 15:38 ` [PATCH v2 3/4] gpu: nova-core: fix barrier usage in CPU->GSP messaging path Gary Guo
  2026-06-09 15:38 ` [PATCH v2 4/4] gpu: nova-core: fix barrier usage in GSP->CPU " Gary Guo
  3 siblings, 1 reply; 7+ messages in thread
From: Gary Guo @ 2026-06-09 15:38 UTC (permalink / raw)
  To: Miguel Ojeda, Boqun Feng, Björn Roy Baron, Benno Lossin,
	Andreas Hindborg, Alice Ryhl, Trevor Gross, Danilo Krummrich,
	Will Deacon, Peter Zijlstra, Mark Rutland, Alexandre Courbot,
	David Airlie, Simona Vetter
  Cc: Gary Guo, Alan Stern, Andrea Parri, Nicholas Piggin,
	David Howells, Jade Alglave, Luc Maranget, Paul E. McKenney,
	Akira Yokosawa, Daniel Lustig, Joel Fernandes, rust-for-linux,
	linux-kernel, nova-gpu, dri-devel, lkmm

Implement a generic interface for memory barriers (full system/DMA/SMP).
The interface uses a parameter to force user to specify their intent with
barriers.

Provide `Read`, `Write`, `Full` orderings which map to the existing
`rmb()`, `wmb()` and `mb()`. Generic is used here instead of providing
individual standalone functions to reduce code duplication; for example,
the `CONFIG_SMP` check in `smp_mb` is uniformly implemented for all SMP
barriers. This could extend to `virt_mb`'s if they're introduced in the
future. It would also make it easier if new ordering types are introduced
in the future (e.g. `Acquire`, `Release`).

Signed-off-by: Gary Guo <gary@garyguo.net>
---
 rust/kernel/sync/atomic/ordering.rs |   2 +-
 rust/kernel/sync/barrier.rs         | 123 ++++++++++++++++++++++++++++--------
 2 files changed, 96 insertions(+), 29 deletions(-)

diff --git a/rust/kernel/sync/atomic/ordering.rs b/rust/kernel/sync/atomic/ordering.rs
index 3f103aa8db99..c4e732e7212f 100644
--- a/rust/kernel/sync/atomic/ordering.rs
+++ b/rust/kernel/sync/atomic/ordering.rs
@@ -15,7 +15,7 @@
 //!   - It provides ordering between the annotated operation and all the following memory accesses.
 //!   - It provides ordering between all the preceding memory accesses and all the following memory
 //!     accesses.
-//!   - All the orderings are the same strength as a full memory barrier (i.e. `smp_mb()`).
+//!   - All the orderings are the same strength as a full memory barrier (i.e. `smp_mb(Full)`).
 //! - [`Relaxed`] provides no ordering except the dependency orderings. Dependency orderings are
 //!   described in "DEPENDENCY RELATIONS" in [`LKMM`]'s [`explanation`].
 //!
diff --git a/rust/kernel/sync/barrier.rs b/rust/kernel/sync/barrier.rs
index 8f2d435fcd94..54c527fdb760 100644
--- a/rust/kernel/sync/barrier.rs
+++ b/rust/kernel/sync/barrier.rs
@@ -7,6 +7,34 @@
 //!
 //! [`LKMM`]: srctree/tools/memory-model/
 
+#![expect(private_bounds, reason = "sealed implementation")]
+
+/// Memory barrier orderings.
+///
+/// The semantics of these orderings follows the [`LKMM`] definitions and rules.
+///
+/// - [`Read`] provides ordering between preceding load operations and succeeding load operations.
+/// - [`Write`] provides ordering between preceding store operations and succeeding store
+///   operations.
+/// - [`Full`] provides ordering between all the preceding memory accesses and succeeding memory
+///   accesses.
+///
+/// [`LKMM`]: srctree/tools/memory-model/
+pub mod ordering {
+    pub use crate::sync::atomic::ordering::Full;
+
+    /// The annotation type for read-read barrier ordering.
+    pub struct Read;
+
+    /// The annotation type for write-write barrier ordering.
+    pub struct Write;
+}
+
+pub use ordering::{Full, Read, Write};
+
+struct Smp;
+struct Dma;
+
 /// A compiler barrier.
 ///
 /// A barrier that prevents compiler from reordering memory accesses across the barrier.
@@ -19,43 +47,82 @@ pub(crate) fn barrier() {
     unsafe { core::arch::asm!("") };
 }
 
-/// A full memory barrier.
+trait MemoryBarrier<Flavour = ()> {
+    fn run();
+}
+
+macro_rules! define_barrier {
+    ($([$flavour:ident])? $ordering:ident, $binding:ident) => {
+        impl MemoryBarrier$(<$flavour>)? for $ordering {
+            #[inline]
+            fn run() {
+                // SAFETY: barrier methods are safe to call.
+                unsafe { bindings::$binding() };
+            }
+        }
+    };
+}
+
+define_barrier!(Full, mb);
+define_barrier!(Read, rmb);
+define_barrier!(Write, wmb);
+define_barrier!([Dma] Full, dma_mb);
+define_barrier!([Dma] Read, dma_rmb);
+define_barrier!([Dma] Write, dma_wmb);
+define_barrier!([Smp] Full, smp_mb);
+define_barrier!([Smp] Read, smp_rmb);
+define_barrier!([Smp] Write, smp_wmb);
+
+/// Memory barrier.
 ///
 /// A barrier that prevents compiler and CPU from reordering memory accesses across the barrier.
-#[inline(always)]
-pub fn smp_mb() {
-    if cfg!(CONFIG_SMP) {
-        // SAFETY: `smp_mb()` is safe to call.
-        unsafe { bindings::smp_mb() };
-    } else {
-        barrier();
-    }
+///
+/// The specific forms of reordering can be specified using the parameter.
+/// - `mb(Read)` provides a read-read barrier.
+/// - `mb(Write)` provides a write-write barrier.
+/// - `mb(Full)` provides a full barrier.
+///
+/// # Examples
+///
+/// ```
+/// # use kernel::sync::barrier::*;
+/// mb(Read);
+/// mb(Write);
+/// mb(Full);
+/// ```
+#[inline]
+#[doc(alias = "rmb")]
+#[doc(alias = "wmb")]
+pub fn mb<T: MemoryBarrier>(_: T) {
+    T::run()
 }
 
-/// A write-write memory barrier.
+/// Memory barrier between CPUs.
 ///
-/// A barrier that prevents compiler and CPU from reordering memory write accesses across the
-/// barrier.
-#[inline(always)]
-pub fn smp_wmb() {
+/// A barrier that prevents compiler and CPU from reordering memory accesses across the barrier.
+/// Does not prevent re-ordering with respect to other bus-mastering devices.
+///
+/// See [`mb`] for usage.
+#[inline]
+#[doc(alias = "smp_rmb")]
+#[doc(alias = "smp_wmb")]
+pub fn smp_mb<T: MemoryBarrier<Smp>>(_: T) {
     if cfg!(CONFIG_SMP) {
-        // SAFETY: `smp_wmb()` is safe to call.
-        unsafe { bindings::smp_wmb() };
+        T::run()
     } else {
-        barrier();
+        barrier()
     }
 }
 
-/// A read-read memory barrier.
+/// Memory barrier between local CPU and bus-mastering devices.
 ///
-/// A barrier that prevents compiler and CPU from reordering memory read accesses across the
-/// barrier.
-#[inline(always)]
-pub fn smp_rmb() {
-    if cfg!(CONFIG_SMP) {
-        // SAFETY: `smp_rmb()` is safe to call.
-        unsafe { bindings::smp_rmb() };
-    } else {
-        barrier();
-    }
+/// A barrier that prevents compiler and CPU from reordering memory accesses across the barrier.
+/// Does not prevent re-ordering with respect to other CPUs.
+///
+/// See [`mb`] for usage.
+#[inline]
+#[doc(alias = "dma_rmb")]
+#[doc(alias = "dma_wmb")]
+pub fn dma_mb<T: MemoryBarrier<Dma>>(_: T) {
+    T::run()
 }

-- 
2.54.0


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH v2 3/4] gpu: nova-core: fix barrier usage in CPU->GSP messaging path
  2026-06-09 15:38 [PATCH v2 0/4] rust: more memory barriers bindings Gary Guo
  2026-06-09 15:38 ` [PATCH v2 1/4] rust: sync: add helpers for mb, dma_mb and friends Gary Guo
  2026-06-09 15:38 ` [PATCH v2 2/4] rust: sync: generic memory barriers Gary Guo
@ 2026-06-09 15:38 ` Gary Guo
  2026-06-09 15:38 ` [PATCH v2 4/4] gpu: nova-core: fix barrier usage in GSP->CPU " Gary Guo
  3 siblings, 0 replies; 7+ messages in thread
From: Gary Guo @ 2026-06-09 15:38 UTC (permalink / raw)
  To: Miguel Ojeda, Boqun Feng, Björn Roy Baron, Benno Lossin,
	Andreas Hindborg, Alice Ryhl, Trevor Gross, Danilo Krummrich,
	Will Deacon, Peter Zijlstra, Mark Rutland, Alexandre Courbot,
	David Airlie, Simona Vetter
  Cc: Gary Guo, Alan Stern, Andrea Parri, Nicholas Piggin,
	David Howells, Jade Alglave, Luc Maranget, Paul E. McKenney,
	Akira Yokosawa, Daniel Lustig, Joel Fernandes, rust-for-linux,
	linux-kernel, nova-gpu, dri-devel, lkmm

In the CPU->GSP messaging path, the code reads the read pointer from GSP,
writes the command, advances the write pointer, and then notifies the GSP.

A LOAD->STORE ordering is needed after reading the read pointer from GSP
and writing the command. A barrier is not needed here because control
dependency provides the required ordering. Still, document it explicitly.

A STORE->STORE ordering is needed after the command write and the write
pointer advance. This is currently incorrectly done after the write pointer
advance (and before GSP notification), but this can cause issue if GSP is
still processing ring buffer, as it may observe the write pointer advance
before command write. Thus move this barrier to be before the write pointer
advance. Note that barriers are not needed between write pointer advance
and GSP notification, as MMIO accessors already carries the required
barrier.

Signed-off-by: Gary Guo <gary@garyguo.net>
---
 drivers/gpu/nova-core/gsp/cmdq.rs | 10 ++++++++++
 drivers/gpu/nova-core/gsp/fw.rs   |  3 ---
 2 files changed, 10 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/nova-core/gsp/cmdq.rs b/drivers/gpu/nova-core/gsp/cmdq.rs
index 070de0731e95..94c2790d943d 100644
--- a/drivers/gpu/nova-core/gsp/cmdq.rs
+++ b/drivers/gpu/nova-core/gsp/cmdq.rs
@@ -20,6 +20,10 @@
     ptr,
     sync::{
         aref::ARef,
+        barrier::{
+            dma_mb,
+            Write, //
+        },
         Mutex, //
     },
     time::Delta,
@@ -274,6 +278,9 @@ fn new(dev: &device::Device<device::Bound>) -> Result<Self> {
             (rx - 1, 0)
         };
 
+        // ORDERING: control dependency provides necessary LOAD->STORE ordering.
+        // `dma_mb(Full)` may be used here if we don't want to rely on control dependency.
+
         // SAFETY:
         // - `data` was created from a valid pointer, and `rx` and `tx` are in the
         //   `0..MSGQ_NUM_PAGES` range per the invariants of `cpu_write_ptr` and `gsp_read_ptr`,
@@ -443,6 +450,9 @@ fn cpu_write_ptr(&self) -> u32 {
 
     // Informs the GSP that it can process `elem_count` new pages from the command queue.
     fn advance_cpu_write_ptr(&mut self, elem_count: u32) {
+        // ORDERING: Ensure all command data is visible before updateing ring buffer pointer.
+        dma_mb(Write);
+
         super::fw::gsp_mem::advance_cpu_write_ptr(&self.0, elem_count)
     }
 }
diff --git a/drivers/gpu/nova-core/gsp/fw.rs b/drivers/gpu/nova-core/gsp/fw.rs
index 4db0cfa4dc4d..fae2eae495d1 100644
--- a/drivers/gpu/nova-core/gsp/fw.rs
+++ b/drivers/gpu/nova-core/gsp/fw.rs
@@ -91,9 +91,6 @@ pub(in crate::gsp) fn advance_cpu_write_ptr(qs: &Coherent<GspMem>, count: u32) {
         let wptr = cpu_write_ptr(qs).wrapping_add(count) % MSGQ_NUM_PAGES;
 
         dma_write!(qs, .cpuq.tx.0.writePtr, wptr);
-
-        // Ensure all command data is visible before triggering the GSP read.
-        fence(Ordering::SeqCst);
     }
 }
 

-- 
2.54.0


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH v2 4/4] gpu: nova-core: fix barrier usage in GSP->CPU messaging path
  2026-06-09 15:38 [PATCH v2 0/4] rust: more memory barriers bindings Gary Guo
                   ` (2 preceding siblings ...)
  2026-06-09 15:38 ` [PATCH v2 3/4] gpu: nova-core: fix barrier usage in CPU->GSP messaging path Gary Guo
@ 2026-06-09 15:38 ` Gary Guo
  3 siblings, 0 replies; 7+ messages in thread
From: Gary Guo @ 2026-06-09 15:38 UTC (permalink / raw)
  To: Miguel Ojeda, Boqun Feng, Björn Roy Baron, Benno Lossin,
	Andreas Hindborg, Alice Ryhl, Trevor Gross, Danilo Krummrich,
	Will Deacon, Peter Zijlstra, Mark Rutland, Alexandre Courbot,
	David Airlie, Simona Vetter
  Cc: Gary Guo, Alan Stern, Andrea Parri, Nicholas Piggin,
	David Howells, Jade Alglave, Luc Maranget, Paul E. McKenney,
	Akira Yokosawa, Daniel Lustig, Joel Fernandes, rust-for-linux,
	linux-kernel, nova-gpu, dri-devel, lkmm

In the CPU->GSP messaging path, the code reads the write pointer from GSP,
reads the response and advances the read pointer.

A LOAD->LOAD ordering is required after the write pointer read and the data
read. Add it as this is currently missing.

A LOAD->STORE ordering is required after the data read and the advance of
read pointer. Currently a Rust `SeqCst` barrier is used, which roughly maps
to `smp_mb(Full)`; this however does not order DMA operations (notably on
ARM, the generate barrier orders inner shareable and not outer shareable,
which is ordered by `dma_mb`). This ordering does not need to be in between
read pointer read and write, because it's for ordering between the ring
buffer data and the pointer; the RMW operation does not internally need a
barrier (nor it has to be atomic, as CPU pointers are updated by CPU only),
so move it out to cmdq.rs for clarity.

Signed-off-by: Gary Guo <gary@garyguo.net>
---
 drivers/gpu/nova-core/gsp/cmdq.rs | 8 ++++++++
 drivers/gpu/nova-core/gsp/fw.rs   | 9 ---------
 2 files changed, 8 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/nova-core/gsp/cmdq.rs b/drivers/gpu/nova-core/gsp/cmdq.rs
index 94c2790d943d..229d49a67077 100644
--- a/drivers/gpu/nova-core/gsp/cmdq.rs
+++ b/drivers/gpu/nova-core/gsp/cmdq.rs
@@ -22,6 +22,8 @@
         aref::ARef,
         barrier::{
             dma_mb,
+            Full,
+            Read,
             Write, //
         },
         Mutex, //
@@ -339,6 +341,9 @@ fn driver_write_area_size(&self) -> usize {
             (MSGQ_NUM_PAGES, tx)
         };
 
+        // ORDERING: Ensure data load is ordered after load of GSP write pointer.
+        dma_mb(Read);
+
         // SAFETY:
         // - `data` was created from a valid pointer, and `rx` and `tx` are in the
         //   `0..MSGQ_NUM_PAGES` range per the invariants of `gsp_write_ptr` and `cpu_read_ptr`,
@@ -436,6 +441,9 @@ fn cpu_read_ptr(&self) -> u32 {
 
     // Informs the GSP that it can send `elem_count` new pages into the message queue.
     fn advance_cpu_read_ptr(&mut self, elem_count: u32) {
+        // ORDERING: Ensure read pointer is properly ordered with a LOAD->STORE ordering.
+        dma_mb(Full);
+
         super::fw::gsp_mem::advance_cpu_read_ptr(&self.0, elem_count)
     }
 
diff --git a/drivers/gpu/nova-core/gsp/fw.rs b/drivers/gpu/nova-core/gsp/fw.rs
index fae2eae495d1..b213385c1f01 100644
--- a/drivers/gpu/nova-core/gsp/fw.rs
+++ b/drivers/gpu/nova-core/gsp/fw.rs
@@ -46,11 +46,6 @@
 
 // TODO: Replace with `IoView` projections once available.
 pub(super) mod gsp_mem {
-    use core::sync::atomic::{
-        fence,
-        Ordering, //
-    };
-
     use kernel::{
         dma::Coherent,
         dma_read,
@@ -76,10 +71,6 @@ pub(in crate::gsp) fn cpu_read_ptr(qs: &Coherent<GspMem>) -> u32 {
 
     pub(in crate::gsp) fn advance_cpu_read_ptr(qs: &Coherent<GspMem>, count: u32) {
         let rptr = cpu_read_ptr(qs).wrapping_add(count) % MSGQ_NUM_PAGES;
-
-        // Ensure read pointer is properly ordered.
-        fence(Ordering::SeqCst);
-
         dma_write!(qs, .cpuq.rx.0.readPtr, rptr);
     }
 

-- 
2.54.0


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH v2 1/4] rust: sync: add helpers for mb, dma_mb and friends
  2026-06-09 15:38 ` [PATCH v2 1/4] rust: sync: add helpers for mb, dma_mb and friends Gary Guo
@ 2026-07-02  4:35   ` Boqun Feng
  0 siblings, 0 replies; 7+ messages in thread
From: Boqun Feng @ 2026-07-02  4:35 UTC (permalink / raw)
  To: Gary Guo
  Cc: Miguel Ojeda, Björn Roy Baron, Benno Lossin,
	Andreas Hindborg, Alice Ryhl, Trevor Gross, Danilo Krummrich,
	Will Deacon, Peter Zijlstra, Mark Rutland, Alexandre Courbot,
	David Airlie, Simona Vetter, Alan Stern, Andrea Parri,
	Nicholas Piggin, David Howells, Jade Alglave, Luc Maranget,
	Paul E. McKenney, Akira Yokosawa, Daniel Lustig, Joel Fernandes,
	rust-for-linux, linux-kernel, nova-gpu, dri-devel, lkmm,
	Eliot Courtney

On Tue, Jun 09, 2026 at 04:38:37PM +0100, Gary Guo wrote:
> They supplement the existing smp_mb, smp_rmb and smp_wmb.
> 
> Reviewed-by: Eliot Courtney <ecourtney@nvidia.com>
> Signed-off-by: Gary Guo <gary@garyguo.net>

Queued in rust-sync, thanks!


Regards,
Boqun

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v2 2/4] rust: sync: generic memory barriers
  2026-06-09 15:38 ` [PATCH v2 2/4] rust: sync: generic memory barriers Gary Guo
@ 2026-07-02  4:56   ` Boqun Feng
  0 siblings, 0 replies; 7+ messages in thread
From: Boqun Feng @ 2026-07-02  4:56 UTC (permalink / raw)
  To: Gary Guo
  Cc: Miguel Ojeda, Björn Roy Baron, Benno Lossin,
	Andreas Hindborg, Alice Ryhl, Trevor Gross, Danilo Krummrich,
	Will Deacon, Peter Zijlstra, Mark Rutland, Alexandre Courbot,
	David Airlie, Simona Vetter, Alan Stern, Andrea Parri,
	Nicholas Piggin, David Howells, Jade Alglave, Luc Maranget,
	Paul E. McKenney, Akira Yokosawa, Daniel Lustig, Joel Fernandes,
	rust-for-linux, linux-kernel, nova-gpu, dri-devel, lkmm

On Tue, Jun 09, 2026 at 04:38:38PM +0100, Gary Guo wrote:
> Implement a generic interface for memory barriers (full system/DMA/SMP).
> The interface uses a parameter to force user to specify their intent with
> barriers.
> 
> Provide `Read`, `Write`, `Full` orderings which map to the existing
> `rmb()`, `wmb()` and `mb()`. Generic is used here instead of providing
> individual standalone functions to reduce code duplication; for example,
> the `CONFIG_SMP` check in `smp_mb` is uniformly implemented for all SMP
> barriers. This could extend to `virt_mb`'s if they're introduced in the
> future. It would also make it easier if new ordering types are introduced
> in the future (e.g. `Acquire`, `Release`).
> 

Queued with minor changes:

* Added "Add" into the commit title.
* Applied vertical import style.

But I do want to get more memory model reviews on this patch series, so
please do take a look if you're interested ;-)

Regards,
Boqun

> Signed-off-by: Gary Guo <gary@garyguo.net>
> ---
>  rust/kernel/sync/atomic/ordering.rs |   2 +-
>  rust/kernel/sync/barrier.rs         | 123 ++++++++++++++++++++++++++++--------
>  2 files changed, 96 insertions(+), 29 deletions(-)
> 
> diff --git a/rust/kernel/sync/atomic/ordering.rs b/rust/kernel/sync/atomic/ordering.rs
> index 3f103aa8db99..c4e732e7212f 100644
> --- a/rust/kernel/sync/atomic/ordering.rs
> +++ b/rust/kernel/sync/atomic/ordering.rs
> @@ -15,7 +15,7 @@
>  //!   - It provides ordering between the annotated operation and all the following memory accesses.
>  //!   - It provides ordering between all the preceding memory accesses and all the following memory
>  //!     accesses.
> -//!   - All the orderings are the same strength as a full memory barrier (i.e. `smp_mb()`).
> +//!   - All the orderings are the same strength as a full memory barrier (i.e. `smp_mb(Full)`).
>  //! - [`Relaxed`] provides no ordering except the dependency orderings. Dependency orderings are
>  //!   described in "DEPENDENCY RELATIONS" in [`LKMM`]'s [`explanation`].
>  //!
> diff --git a/rust/kernel/sync/barrier.rs b/rust/kernel/sync/barrier.rs
> index 8f2d435fcd94..54c527fdb760 100644
> --- a/rust/kernel/sync/barrier.rs
> +++ b/rust/kernel/sync/barrier.rs
> @@ -7,6 +7,34 @@
>  //!
>  //! [`LKMM`]: srctree/tools/memory-model/
>  
> +#![expect(private_bounds, reason = "sealed implementation")]
> +
> +/// Memory barrier orderings.
> +///
> +/// The semantics of these orderings follows the [`LKMM`] definitions and rules.
> +///
> +/// - [`Read`] provides ordering between preceding load operations and succeeding load operations.
> +/// - [`Write`] provides ordering between preceding store operations and succeeding store
> +///   operations.
> +/// - [`Full`] provides ordering between all the preceding memory accesses and succeeding memory
> +///   accesses.
> +///
> +/// [`LKMM`]: srctree/tools/memory-model/
> +pub mod ordering {
> +    pub use crate::sync::atomic::ordering::Full;
> +
> +    /// The annotation type for read-read barrier ordering.
> +    pub struct Read;
> +
> +    /// The annotation type for write-write barrier ordering.
> +    pub struct Write;
> +}
> +
> +pub use ordering::{Full, Read, Write};
> +
> +struct Smp;
> +struct Dma;
> +
>  /// A compiler barrier.
>  ///
>  /// A barrier that prevents compiler from reordering memory accesses across the barrier.
> @@ -19,43 +47,82 @@ pub(crate) fn barrier() {
>      unsafe { core::arch::asm!("") };
>  }
>  
> -/// A full memory barrier.
> +trait MemoryBarrier<Flavour = ()> {
> +    fn run();
> +}
> +
> +macro_rules! define_barrier {
> +    ($([$flavour:ident])? $ordering:ident, $binding:ident) => {
> +        impl MemoryBarrier$(<$flavour>)? for $ordering {
> +            #[inline]
> +            fn run() {
> +                // SAFETY: barrier methods are safe to call.
> +                unsafe { bindings::$binding() };
> +            }
> +        }
> +    };
> +}
> +
> +define_barrier!(Full, mb);
> +define_barrier!(Read, rmb);
> +define_barrier!(Write, wmb);
> +define_barrier!([Dma] Full, dma_mb);
> +define_barrier!([Dma] Read, dma_rmb);
> +define_barrier!([Dma] Write, dma_wmb);
> +define_barrier!([Smp] Full, smp_mb);
> +define_barrier!([Smp] Read, smp_rmb);
> +define_barrier!([Smp] Write, smp_wmb);
> +
> +/// Memory barrier.
>  ///
>  /// A barrier that prevents compiler and CPU from reordering memory accesses across the barrier.
> -#[inline(always)]
> -pub fn smp_mb() {
> -    if cfg!(CONFIG_SMP) {
> -        // SAFETY: `smp_mb()` is safe to call.
> -        unsafe { bindings::smp_mb() };
> -    } else {
> -        barrier();
> -    }
> +///
> +/// The specific forms of reordering can be specified using the parameter.
> +/// - `mb(Read)` provides a read-read barrier.
> +/// - `mb(Write)` provides a write-write barrier.
> +/// - `mb(Full)` provides a full barrier.
> +///
> +/// # Examples
> +///
> +/// ```
> +/// # use kernel::sync::barrier::*;
> +/// mb(Read);
> +/// mb(Write);
> +/// mb(Full);
> +/// ```
> +#[inline]
> +#[doc(alias = "rmb")]
> +#[doc(alias = "wmb")]
> +pub fn mb<T: MemoryBarrier>(_: T) {
> +    T::run()
>  }
>  
> -/// A write-write memory barrier.
> +/// Memory barrier between CPUs.
>  ///
> -/// A barrier that prevents compiler and CPU from reordering memory write accesses across the
> -/// barrier.
> -#[inline(always)]
> -pub fn smp_wmb() {
> +/// A barrier that prevents compiler and CPU from reordering memory accesses across the barrier.
> +/// Does not prevent re-ordering with respect to other bus-mastering devices.
> +///
> +/// See [`mb`] for usage.
> +#[inline]
> +#[doc(alias = "smp_rmb")]
> +#[doc(alias = "smp_wmb")]
> +pub fn smp_mb<T: MemoryBarrier<Smp>>(_: T) {
>      if cfg!(CONFIG_SMP) {
> -        // SAFETY: `smp_wmb()` is safe to call.
> -        unsafe { bindings::smp_wmb() };
> +        T::run()
>      } else {
> -        barrier();
> +        barrier()
>      }
>  }
>  
> -/// A read-read memory barrier.
> +/// Memory barrier between local CPU and bus-mastering devices.
>  ///
> -/// A barrier that prevents compiler and CPU from reordering memory read accesses across the
> -/// barrier.
> -#[inline(always)]
> -pub fn smp_rmb() {
> -    if cfg!(CONFIG_SMP) {
> -        // SAFETY: `smp_rmb()` is safe to call.
> -        unsafe { bindings::smp_rmb() };
> -    } else {
> -        barrier();
> -    }
> +/// A barrier that prevents compiler and CPU from reordering memory accesses across the barrier.
> +/// Does not prevent re-ordering with respect to other CPUs.
> +///
> +/// See [`mb`] for usage.
> +#[inline]
> +#[doc(alias = "dma_rmb")]
> +#[doc(alias = "dma_wmb")]
> +pub fn dma_mb<T: MemoryBarrier<Dma>>(_: T) {
> +    T::run()
>  }
> 
> -- 
> 2.54.0
> 

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2026-07-02  4:56 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-06-09 15:38 [PATCH v2 0/4] rust: more memory barriers bindings Gary Guo
2026-06-09 15:38 ` [PATCH v2 1/4] rust: sync: add helpers for mb, dma_mb and friends Gary Guo
2026-07-02  4:35   ` Boqun Feng
2026-06-09 15:38 ` [PATCH v2 2/4] rust: sync: generic memory barriers Gary Guo
2026-07-02  4:56   ` Boqun Feng
2026-06-09 15:38 ` [PATCH v2 3/4] gpu: nova-core: fix barrier usage in CPU->GSP messaging path Gary Guo
2026-06-09 15:38 ` [PATCH v2 4/4] gpu: nova-core: fix barrier usage in GSP->CPU " Gary Guo

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox