[PATCH v2 0/4] rust: more memory barriers bindings

All of lore.kernel.org
 help / color / mirror / Atom feed

* [PATCH v2 0/4] rust: more memory barriers bindings
@ 2026-06-09 15:38 Gary Guo
  2026-06-09 15:38 ` [PATCH v2 1/4] rust: sync: add helpers for mb, dma_mb and friends Gary Guo
                   ` (3 more replies)
  0 siblings, 4 replies; 8+ messages in thread
From: Gary Guo @ 2026-06-09 15:38 UTC (permalink / raw)
  To: Miguel Ojeda, Boqun Feng, Björn Roy Baron, Benno Lossin,
	Andreas Hindborg, Alice Ryhl, Trevor Gross, Danilo Krummrich,
	Will Deacon, Peter Zijlstra, Mark Rutland, Alexandre Courbot,
	David Airlie, Simona Vetter
  Cc: Gary Guo, Alan Stern, Andrea Parri, Nicholas Piggin,
	David Howells, Jade Alglave, Luc Maranget, Paul E. McKenney,
	Akira Yokosawa, Daniel Lustig, Joel Fernandes, rust-for-linux,
	linux-kernel, nova-gpu, dri-devel, lkmm, Eliot Courtney

This expands the existing smp barriers to also mandatory barriers and DMA
barriers.

The API looks like: `mb(Ordering)`/`smp_mb(Ordering)`/`dma_mb(Ordering)`,
where `Ordering` is one of `Full`, `Read`, `Write`.

I retain the use of generics to avoid duplicating `CONFIG_SMP` check (but
also retaining the ability to easily add `Acquire` and `Release` ordering
in the future).

A user of these introduced API is included in patch 3, which is a
concurrency bug that exists in Nova code today due to missing barriers.

Signed-off-by: Gary Guo <gary@garyguo.net>
---
Changes in v2:
- Dropped `Acquire` and `Release` aliases of `Full` (Joel)
- Use macros to implement most `MemoryBarrier`
- Split Nova change to GSP->CPU commit and CPU->GSP commit (Joel)
- Link to v1: https://patch.msgid.link/20260402152443.1059634-2-gary@kernel.org

To: Miguel Ojeda <ojeda@kernel.org>
To: Boqun Feng <boqun@kernel.org>
To: Gary Guo <gary@garyguo.net>
To: Björn Roy Baron <bjorn3_gh@protonmail.com>
To: Benno Lossin <lossin@kernel.org>
To: Andreas Hindborg <a.hindborg@kernel.org>
To: Alice Ryhl <aliceryhl@google.com>
To: Trevor Gross <tmgross@umich.edu>
To: Danilo Krummrich <dakr@kernel.org>
To: Will Deacon <will@kernel.org>
To: Peter Zijlstra <peterz@infradead.org>
To: Mark Rutland <mark.rutland@arm.com>
To: Alexandre Courbot <acourbot@nvidia.com>
To: David Airlie <airlied@gmail.com>
To: Simona Vetter <simona@ffwll.ch>
Cc: Alan Stern <stern@rowland.harvard.edu>
Cc: Andrea Parri <parri.andrea@gmail.com>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Jade Alglave <j.alglave@ucl.ac.uk>
Cc: Luc Maranget <luc.maranget@inria.fr>
Cc: "Paul E. McKenney" <paulmck@kernel.org>
Cc: Akira Yokosawa <akiyks@gmail.com>
Cc: Daniel Lustig <dlustig@nvidia.com>
Cc: Joel Fernandes <joelagnelf@nvidia.com>
Cc: rust-for-linux@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: nova-gpu@lists.linux.dev
Cc: dri-devel@lists.freedesktop.org
Cc: lkmm@lists.linux.dev

---
Gary Guo (4):
      rust: sync: add helpers for mb, dma_mb and friends
      rust: sync: generic memory barriers
      gpu: nova-core: fix barrier usage in CPU->GSP messaging path
      gpu: nova-core: fix barrier usage in GSP->CPU messaging path

 drivers/gpu/nova-core/gsp/cmdq.rs   |  18 ++++++
 drivers/gpu/nova-core/gsp/fw.rs     |  12 ----
 rust/helpers/barrier.c              |  30 +++++++++
 rust/kernel/sync/atomic/ordering.rs |   2 +-
 rust/kernel/sync/barrier.rs         | 123 ++++++++++++++++++++++++++++--------
 5 files changed, 144 insertions(+), 41 deletions(-)
---
base-commit: a87737435cfa134f9cdcc696ba3080759d04cf72
change-id: 20260609-rust-barrier-63078ea76216

Best regards,
--  
Gary Guo <gary@garyguo.net>

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH v2 1/4] rust: sync: add helpers for mb, dma_mb and friends
  2026-06-09 15:38 [PATCH v2 0/4] rust: more memory barriers bindings Gary Guo
@ 2026-06-09 15:38 ` Gary Guo
  2026-06-09 15:38 ` [PATCH v2 2/4] rust: sync: generic memory barriers Gary Guo
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 8+ messages in thread
From: Gary Guo @ 2026-06-09 15:38 UTC (permalink / raw)
  To: Miguel Ojeda, Boqun Feng, Björn Roy Baron, Benno Lossin,
	Andreas Hindborg, Alice Ryhl, Trevor Gross, Danilo Krummrich,
	Will Deacon, Peter Zijlstra, Mark Rutland, Alexandre Courbot,
	David Airlie, Simona Vetter
  Cc: Gary Guo, Alan Stern, Andrea Parri, Nicholas Piggin,
	David Howells, Jade Alglave, Luc Maranget, Paul E. McKenney,
	Akira Yokosawa, Daniel Lustig, Joel Fernandes, rust-for-linux,
	linux-kernel, nova-gpu, dri-devel, lkmm, Eliot Courtney

They supplement the existing smp_mb, smp_rmb and smp_wmb.

Reviewed-by: Eliot Courtney <ecourtney@nvidia.com>
Signed-off-by: Gary Guo <gary@garyguo.net>
---
 rust/helpers/barrier.c | 30 ++++++++++++++++++++++++++++++
 1 file changed, 30 insertions(+)

diff --git a/rust/helpers/barrier.c b/rust/helpers/barrier.c
index fed8853745c8..dbc7a3017c78 100644
--- a/rust/helpers/barrier.c
+++ b/rust/helpers/barrier.c
@@ -2,6 +2,36 @@
 
 #include <asm/barrier.h>
 
+__rust_helper void rust_helper_mb(void)
+{
+	mb();
+}
+
+__rust_helper void rust_helper_rmb(void)
+{
+	rmb();
+}
+
+__rust_helper void rust_helper_wmb(void)
+{
+	wmb();
+}
+
+__rust_helper void rust_helper_dma_mb(void)
+{
+	dma_mb();
+}
+
+__rust_helper void rust_helper_dma_rmb(void)
+{
+	dma_rmb();
+}
+
+__rust_helper void rust_helper_dma_wmb(void)
+{
+	dma_wmb();
+}
+
 __rust_helper void rust_helper_smp_mb(void)
 {
 	smp_mb();

-- 
2.54.0


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH v2 2/4] rust: sync: generic memory barriers
  2026-06-09 15:38 [PATCH v2 0/4] rust: more memory barriers bindings Gary Guo
  2026-06-09 15:38 ` [PATCH v2 1/4] rust: sync: add helpers for mb, dma_mb and friends Gary Guo
@ 2026-06-09 15:38 ` Gary Guo
  2026-06-09 15:48   ` sashiko-bot
  2026-06-09 15:38 ` [PATCH v2 3/4] gpu: nova-core: fix barrier usage in CPU->GSP messaging path Gary Guo
  2026-06-09 15:38 ` [PATCH v2 4/4] gpu: nova-core: fix barrier usage in GSP->CPU " Gary Guo
  3 siblings, 1 reply; 8+ messages in thread
From: Gary Guo @ 2026-06-09 15:38 UTC (permalink / raw)
  To: Miguel Ojeda, Boqun Feng, Björn Roy Baron, Benno Lossin,
	Andreas Hindborg, Alice Ryhl, Trevor Gross, Danilo Krummrich,
	Will Deacon, Peter Zijlstra, Mark Rutland, Alexandre Courbot,
	David Airlie, Simona Vetter
  Cc: Gary Guo, Alan Stern, Andrea Parri, Nicholas Piggin,
	David Howells, Jade Alglave, Luc Maranget, Paul E. McKenney,
	Akira Yokosawa, Daniel Lustig, Joel Fernandes, rust-for-linux,
	linux-kernel, nova-gpu, dri-devel, lkmm

Implement a generic interface for memory barriers (full system/DMA/SMP).
The interface uses a parameter to force user to specify their intent with
barriers.

Provide `Read`, `Write`, `Full` orderings which map to the existing
`rmb()`, `wmb()` and `mb()`. Generic is used here instead of providing
individual standalone functions to reduce code duplication; for example,
the `CONFIG_SMP` check in `smp_mb` is uniformly implemented for all SMP
barriers. This could extend to `virt_mb`'s if they're introduced in the
future. It would also make it easier if new ordering types are introduced
in the future (e.g. `Acquire`, `Release`).

Signed-off-by: Gary Guo <gary@garyguo.net>
---
 rust/kernel/sync/atomic/ordering.rs |   2 +-
 rust/kernel/sync/barrier.rs         | 123 ++++++++++++++++++++++++++++--------
 2 files changed, 96 insertions(+), 29 deletions(-)

diff --git a/rust/kernel/sync/atomic/ordering.rs b/rust/kernel/sync/atomic/ordering.rs
index 3f103aa8db99..c4e732e7212f 100644
--- a/rust/kernel/sync/atomic/ordering.rs
+++ b/rust/kernel/sync/atomic/ordering.rs
@@ -15,7 +15,7 @@
 //!   - It provides ordering between the annotated operation and all the following memory accesses.
 //!   - It provides ordering between all the preceding memory accesses and all the following memory
 //!     accesses.
-//!   - All the orderings are the same strength as a full memory barrier (i.e. `smp_mb()`).
+//!   - All the orderings are the same strength as a full memory barrier (i.e. `smp_mb(Full)`).
 //! - [`Relaxed`] provides no ordering except the dependency orderings. Dependency orderings are
 //!   described in "DEPENDENCY RELATIONS" in [`LKMM`]'s [`explanation`].
 //!
diff --git a/rust/kernel/sync/barrier.rs b/rust/kernel/sync/barrier.rs
index 8f2d435fcd94..54c527fdb760 100644
--- a/rust/kernel/sync/barrier.rs
+++ b/rust/kernel/sync/barrier.rs
@@ -7,6 +7,34 @@
 //!
 //! [`LKMM`]: srctree/tools/memory-model/
 
+#![expect(private_bounds, reason = "sealed implementation")]
+
+/// Memory barrier orderings.
+///
+/// The semantics of these orderings follows the [`LKMM`] definitions and rules.
+///
+/// - [`Read`] provides ordering between preceding load operations and succeeding load operations.
+/// - [`Write`] provides ordering between preceding store operations and succeeding store
+///   operations.
+/// - [`Full`] provides ordering between all the preceding memory accesses and succeeding memory
+///   accesses.
+///
+/// [`LKMM`]: srctree/tools/memory-model/
+pub mod ordering {
+    pub use crate::sync::atomic::ordering::Full;
+
+    /// The annotation type for read-read barrier ordering.
+    pub struct Read;
+
+    /// The annotation type for write-write barrier ordering.
+    pub struct Write;
+}
+
+pub use ordering::{Full, Read, Write};
+
+struct Smp;
+struct Dma;
+
 /// A compiler barrier.
 ///
 /// A barrier that prevents compiler from reordering memory accesses across the barrier.
@@ -19,43 +47,82 @@ pub(crate) fn barrier() {
     unsafe { core::arch::asm!("") };
 }
 
-/// A full memory barrier.
+trait MemoryBarrier<Flavour = ()> {
+    fn run();
+}
+
+macro_rules! define_barrier {
+    ($([$flavour:ident])? $ordering:ident, $binding:ident) => {
+        impl MemoryBarrier$(<$flavour>)? for $ordering {
+            #[inline]
+            fn run() {
+                // SAFETY: barrier methods are safe to call.
+                unsafe { bindings::$binding() };
+            }
+        }
+    };
+}
+
+define_barrier!(Full, mb);
+define_barrier!(Read, rmb);
+define_barrier!(Write, wmb);
+define_barrier!([Dma] Full, dma_mb);
+define_barrier!([Dma] Read, dma_rmb);
+define_barrier!([Dma] Write, dma_wmb);
+define_barrier!([Smp] Full, smp_mb);
+define_barrier!([Smp] Read, smp_rmb);
+define_barrier!([Smp] Write, smp_wmb);
+
+/// Memory barrier.
 ///
 /// A barrier that prevents compiler and CPU from reordering memory accesses across the barrier.
-#[inline(always)]
-pub fn smp_mb() {
-    if cfg!(CONFIG_SMP) {
-        // SAFETY: `smp_mb()` is safe to call.
-        unsafe { bindings::smp_mb() };
-    } else {
-        barrier();
-    }
+///
+/// The specific forms of reordering can be specified using the parameter.
+/// - `mb(Read)` provides a read-read barrier.
+/// - `mb(Write)` provides a write-write barrier.
+/// - `mb(Full)` provides a full barrier.
+///
+/// # Examples
+///
+/// ```
+/// # use kernel::sync::barrier::*;
+/// mb(Read);
+/// mb(Write);
+/// mb(Full);
+/// ```
+#[inline]
+#[doc(alias = "rmb")]
+#[doc(alias = "wmb")]
+pub fn mb<T: MemoryBarrier>(_: T) {
+    T::run()
 }
 
-/// A write-write memory barrier.
+/// Memory barrier between CPUs.
 ///
-/// A barrier that prevents compiler and CPU from reordering memory write accesses across the
-/// barrier.
-#[inline(always)]
-pub fn smp_wmb() {
+/// A barrier that prevents compiler and CPU from reordering memory accesses across the barrier.
+/// Does not prevent re-ordering with respect to other bus-mastering devices.
+///
+/// See [`mb`] for usage.
+#[inline]
+#[doc(alias = "smp_rmb")]
+#[doc(alias = "smp_wmb")]
+pub fn smp_mb<T: MemoryBarrier<Smp>>(_: T) {
     if cfg!(CONFIG_SMP) {
-        // SAFETY: `smp_wmb()` is safe to call.
-        unsafe { bindings::smp_wmb() };
+        T::run()
     } else {
-        barrier();
+        barrier()
     }
 }
 
-/// A read-read memory barrier.
+/// Memory barrier between local CPU and bus-mastering devices.
 ///
-/// A barrier that prevents compiler and CPU from reordering memory read accesses across the
-/// barrier.
-#[inline(always)]
-pub fn smp_rmb() {
-    if cfg!(CONFIG_SMP) {
-        // SAFETY: `smp_rmb()` is safe to call.
-        unsafe { bindings::smp_rmb() };
-    } else {
-        barrier();
-    }
+/// A barrier that prevents compiler and CPU from reordering memory accesses across the barrier.
+/// Does not prevent re-ordering with respect to other CPUs.
+///
+/// See [`mb`] for usage.
+#[inline]
+#[doc(alias = "dma_rmb")]
+#[doc(alias = "dma_wmb")]
+pub fn dma_mb<T: MemoryBarrier<Dma>>(_: T) {
+    T::run()
 }

-- 
2.54.0


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH v2 3/4] gpu: nova-core: fix barrier usage in CPU->GSP messaging path
  2026-06-09 15:38 [PATCH v2 0/4] rust: more memory barriers bindings Gary Guo
  2026-06-09 15:38 ` [PATCH v2 1/4] rust: sync: add helpers for mb, dma_mb and friends Gary Guo
  2026-06-09 15:38 ` [PATCH v2 2/4] rust: sync: generic memory barriers Gary Guo
@ 2026-06-09 15:38 ` Gary Guo
  2026-06-09 15:50   ` sashiko-bot
  2026-06-09 15:38 ` [PATCH v2 4/4] gpu: nova-core: fix barrier usage in GSP->CPU " Gary Guo
  3 siblings, 1 reply; 8+ messages in thread
From: Gary Guo @ 2026-06-09 15:38 UTC (permalink / raw)
  To: Miguel Ojeda, Boqun Feng, Björn Roy Baron, Benno Lossin,
	Andreas Hindborg, Alice Ryhl, Trevor Gross, Danilo Krummrich,
	Will Deacon, Peter Zijlstra, Mark Rutland, Alexandre Courbot,
	David Airlie, Simona Vetter
  Cc: Gary Guo, Alan Stern, Andrea Parri, Nicholas Piggin,
	David Howells, Jade Alglave, Luc Maranget, Paul E. McKenney,
	Akira Yokosawa, Daniel Lustig, Joel Fernandes, rust-for-linux,
	linux-kernel, nova-gpu, dri-devel, lkmm

In the CPU->GSP messaging path, the code reads the read pointer from GSP,
writes the command, advances the write pointer, and then notifies the GSP.

A LOAD->STORE ordering is needed after reading the read pointer from GSP
and writing the command. A barrier is not needed here because control
dependency provides the required ordering. Still, document it explicitly.

A STORE->STORE ordering is needed after the command write and the write
pointer advance. This is currently incorrectly done after the write pointer
advance (and before GSP notification), but this can cause issue if GSP is
still processing ring buffer, as it may observe the write pointer advance
before command write. Thus move this barrier to be before the write pointer
advance. Note that barriers are not needed between write pointer advance
and GSP notification, as MMIO accessors already carries the required
barrier.

Signed-off-by: Gary Guo <gary@garyguo.net>
---
 drivers/gpu/nova-core/gsp/cmdq.rs | 10 ++++++++++
 drivers/gpu/nova-core/gsp/fw.rs   |  3 ---
 2 files changed, 10 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/nova-core/gsp/cmdq.rs b/drivers/gpu/nova-core/gsp/cmdq.rs
index 070de0731e95..94c2790d943d 100644
--- a/drivers/gpu/nova-core/gsp/cmdq.rs
+++ b/drivers/gpu/nova-core/gsp/cmdq.rs
@@ -20,6 +20,10 @@
     ptr,
     sync::{
         aref::ARef,
+        barrier::{
+            dma_mb,
+            Write, //
+        },
         Mutex, //
     },
     time::Delta,
@@ -274,6 +278,9 @@ fn new(dev: &device::Device<device::Bound>) -> Result<Self> {
             (rx - 1, 0)
         };
 
+        // ORDERING: control dependency provides necessary LOAD->STORE ordering.
+        // `dma_mb(Full)` may be used here if we don't want to rely on control dependency.
+
         // SAFETY:
         // - `data` was created from a valid pointer, and `rx` and `tx` are in the
         //   `0..MSGQ_NUM_PAGES` range per the invariants of `cpu_write_ptr` and `gsp_read_ptr`,
@@ -443,6 +450,9 @@ fn cpu_write_ptr(&self) -> u32 {
 
     // Informs the GSP that it can process `elem_count` new pages from the command queue.
     fn advance_cpu_write_ptr(&mut self, elem_count: u32) {
+        // ORDERING: Ensure all command data is visible before updateing ring buffer pointer.
+        dma_mb(Write);
+
         super::fw::gsp_mem::advance_cpu_write_ptr(&self.0, elem_count)
     }
 }
diff --git a/drivers/gpu/nova-core/gsp/fw.rs b/drivers/gpu/nova-core/gsp/fw.rs
index 4db0cfa4dc4d..fae2eae495d1 100644
--- a/drivers/gpu/nova-core/gsp/fw.rs
+++ b/drivers/gpu/nova-core/gsp/fw.rs
@@ -91,9 +91,6 @@ pub(in crate::gsp) fn advance_cpu_write_ptr(qs: &Coherent<GspMem>, count: u32) {
         let wptr = cpu_write_ptr(qs).wrapping_add(count) % MSGQ_NUM_PAGES;
 
         dma_write!(qs, .cpuq.tx.0.writePtr, wptr);
-
-        // Ensure all command data is visible before triggering the GSP read.
-        fence(Ordering::SeqCst);
     }
 }
 

-- 
2.54.0


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH v2 4/4] gpu: nova-core: fix barrier usage in GSP->CPU messaging path
  2026-06-09 15:38 [PATCH v2 0/4] rust: more memory barriers bindings Gary Guo
                   ` (2 preceding siblings ...)
  2026-06-09 15:38 ` [PATCH v2 3/4] gpu: nova-core: fix barrier usage in CPU->GSP messaging path Gary Guo
@ 2026-06-09 15:38 ` Gary Guo
  2026-06-09 15:49   ` sashiko-bot
  3 siblings, 1 reply; 8+ messages in thread
From: Gary Guo @ 2026-06-09 15:38 UTC (permalink / raw)
  To: Miguel Ojeda, Boqun Feng, Björn Roy Baron, Benno Lossin,
	Andreas Hindborg, Alice Ryhl, Trevor Gross, Danilo Krummrich,
	Will Deacon, Peter Zijlstra, Mark Rutland, Alexandre Courbot,
	David Airlie, Simona Vetter
  Cc: Gary Guo, Alan Stern, Andrea Parri, Nicholas Piggin,
	David Howells, Jade Alglave, Luc Maranget, Paul E. McKenney,
	Akira Yokosawa, Daniel Lustig, Joel Fernandes, rust-for-linux,
	linux-kernel, nova-gpu, dri-devel, lkmm

In the CPU->GSP messaging path, the code reads the write pointer from GSP,
reads the response and advances the read pointer.

A LOAD->LOAD ordering is required after the write pointer read and the data
read. Add it as this is currently missing.

A LOAD->STORE ordering is required after the data read and the advance of
read pointer. Currently a Rust `SeqCst` barrier is used, which roughly maps
to `smp_mb(Full)`; this however does not order DMA operations (notably on
ARM, the generate barrier orders inner shareable and not outer shareable,
which is ordered by `dma_mb`). This ordering does not need to be in between
read pointer read and write, because it's for ordering between the ring
buffer data and the pointer; the RMW operation does not internally need a
barrier (nor it has to be atomic, as CPU pointers are updated by CPU only),
so move it out to cmdq.rs for clarity.

Signed-off-by: Gary Guo <gary@garyguo.net>
---
 drivers/gpu/nova-core/gsp/cmdq.rs | 8 ++++++++
 drivers/gpu/nova-core/gsp/fw.rs   | 9 ---------
 2 files changed, 8 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/nova-core/gsp/cmdq.rs b/drivers/gpu/nova-core/gsp/cmdq.rs
index 94c2790d943d..229d49a67077 100644
--- a/drivers/gpu/nova-core/gsp/cmdq.rs
+++ b/drivers/gpu/nova-core/gsp/cmdq.rs
@@ -22,6 +22,8 @@
         aref::ARef,
         barrier::{
             dma_mb,
+            Full,
+            Read,
             Write, //
         },
         Mutex, //
@@ -339,6 +341,9 @@ fn driver_write_area_size(&self) -> usize {
             (MSGQ_NUM_PAGES, tx)
         };
 
+        // ORDERING: Ensure data load is ordered after load of GSP write pointer.
+        dma_mb(Read);
+
         // SAFETY:
         // - `data` was created from a valid pointer, and `rx` and `tx` are in the
         //   `0..MSGQ_NUM_PAGES` range per the invariants of `gsp_write_ptr` and `cpu_read_ptr`,
@@ -436,6 +441,9 @@ fn cpu_read_ptr(&self) -> u32 {
 
     // Informs the GSP that it can send `elem_count` new pages into the message queue.
     fn advance_cpu_read_ptr(&mut self, elem_count: u32) {
+        // ORDERING: Ensure read pointer is properly ordered with a LOAD->STORE ordering.
+        dma_mb(Full);
+
         super::fw::gsp_mem::advance_cpu_read_ptr(&self.0, elem_count)
     }
 
diff --git a/drivers/gpu/nova-core/gsp/fw.rs b/drivers/gpu/nova-core/gsp/fw.rs
index fae2eae495d1..b213385c1f01 100644
--- a/drivers/gpu/nova-core/gsp/fw.rs
+++ b/drivers/gpu/nova-core/gsp/fw.rs
@@ -46,11 +46,6 @@
 
 // TODO: Replace with `IoView` projections once available.
 pub(super) mod gsp_mem {
-    use core::sync::atomic::{
-        fence,
-        Ordering, //
-    };
-
     use kernel::{
         dma::Coherent,
         dma_read,
@@ -76,10 +71,6 @@ pub(in crate::gsp) fn cpu_read_ptr(qs: &Coherent<GspMem>) -> u32 {
 
     pub(in crate::gsp) fn advance_cpu_read_ptr(qs: &Coherent<GspMem>, count: u32) {
         let rptr = cpu_read_ptr(qs).wrapping_add(count) % MSGQ_NUM_PAGES;
-
-        // Ensure read pointer is properly ordered.
-        fence(Ordering::SeqCst);
-
         dma_write!(qs, .cpuq.rx.0.readPtr, rptr);
     }
 

-- 
2.54.0


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH v2 2/4] rust: sync: generic memory barriers
  2026-06-09 15:38 ` [PATCH v2 2/4] rust: sync: generic memory barriers Gary Guo
@ 2026-06-09 15:48   ` sashiko-bot
  0 siblings, 0 replies; 8+ messages in thread
From: sashiko-bot @ 2026-06-09 15:48 UTC (permalink / raw)
  To: Gary Guo; +Cc: Danilo Krummrich, ojeda, Alexandre Courbot, dri-devel

Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider:
- [Low] The import statement `pub use ordering::{Full, Read, Write};` violates the vertical import style guideline.
--

commit c070c38a2a2dec94e13a6201bf19eb33e82b5478
Author: Gary Guo <gary@garyguo.net>

rust: sync: generic memory barriers

This commit implements a generic interface for memory barriers (full system,
DMA, and SMP) that requires the user to specify their intent. It provides
Read, Write, and Full orderings which map to the existing rmb(), wmb(), and
mb() barriers to reduce code duplication.

> diff --git a/rust/kernel/sync/barrier.rs b/rust/kernel/sync/barrier.rs
> index 8f2d435fcd944..54c527fdb7605 100644
> --- a/rust/kernel/sync/barrier.rs
> +++ b/rust/kernel/sync/barrier.rs

[ ... ]

> +pub mod ordering {
> +    pub use crate::sync::atomic::ordering::Full;
> +
> +    /// The annotation type for read-read barrier ordering.
> +    pub struct Read;
> +
> +    /// The annotation type for write-write barrier ordering.
> +    pub struct Write;
> +}
> +
> +pub use ordering::{Full, Read, Write};

[Severity: Low]
This isn't a bug, but does this violate the kernel's vertical import style
guidelines?

The subsystem documentation states that commits touching imports should follow
the vertical import style, which places each imported item on its own line
when braces are used to group multiple items.

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260609-rust-barrier-v2-0-30fcc48e1cd0@garyguo.net?part=2

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2 4/4] gpu: nova-core: fix barrier usage in GSP->CPU messaging path
  2026-06-09 15:38 ` [PATCH v2 4/4] gpu: nova-core: fix barrier usage in GSP->CPU " Gary Guo
@ 2026-06-09 15:49   ` sashiko-bot
  0 siblings, 0 replies; 8+ messages in thread
From: sashiko-bot @ 2026-06-09 15:49 UTC (permalink / raw)
  To: Gary Guo; +Cc: dri-devel, ojeda, Danilo Krummrich, Alexandre Courbot

Thank you for your contribution! Sashiko AI review found 2 potential issue(s) to consider:

Pre-existing issues:
- [High] Early returns in `receive_msg` skip advancing the GSP read pointer, leading to a permanently stalled command queue.
- [High] CPU->GSP messaging path relies on a fragile control dependency for memory ordering instead of an explicit DMA barrier.
--

commit 4410cc89c9104f4a90dab2676e5da7ce4fd1a204
Author: Gary Guo <gary@garyguo.net>

gpu: nova-core: fix barrier usage in GSP->CPU messaging path

This commit replaces the Rust SeqCst barrier with explicit dma_mb(Read) and
dma_mb(Full) barriers in the GSP->CPU messaging path. The SeqCst barrier does
not correctly order DMA operations on architectures like ARM, so using the
proper dma_mb() ensures correct LOAD->LOAD and LOAD->STORE ordering.

> diff --git a/drivers/gpu/nova-core/gsp/cmdq.rs b/drivers/gpu/nova-core/gsp/cmdq.rs
> index 94c2790d943d0..229d49a670779 100644
> --- a/drivers/gpu/nova-core/gsp/cmdq.rs
> +++ b/drivers/gpu/nova-core/gsp/cmdq.rs

[ ... ]

> @@ -339,6 +341,9 @@ fn driver_write_area_size(&self) -> usize {
>              (MSGQ_NUM_PAGES, tx)
>          };
>  
> +        // ORDERING: Ensure data load is ordered after load of GSP write pointer.
> +        dma_mb(Read);
> +

[Severity: High]
This is a pre-existing issue, but since we are updating memory barriers here,
does the CPU->GSP messaging path in driver_write_area() also need an explicit
barrier instead of relying on a fragile control dependency?

drivers/gpu/nova-core/gsp/cmdq.rs:driver_write_area() {
    ...
        let (tail_end, wrap_end) = if rx == 0 {
            ...
        } else if rx <= tx {
            ...
        } else {
            ...
        };

        // ORDERING: control dependency provides necessary LOAD->STORE ordering.
        // dma_mb(Full) may be used here if we don't want to rely on control dependency.
    ...
}

Relying on control dependencies for DMA memory ordering is unsafe in Rust,
as the compiler does not guarantee preserving them and may optimize away
branches (e.g., using conditional selects like csel on ARM64).

Control dependencies also do not emit architectural barriers (like dmb osh
on ARM64), which are necessary to order memory accesses across the system
interconnect for device memory.

If the CPU's store to the command buffer is observed by the GSP before the
CPU's load of rx, could the CPU overwrite data the GSP has not yet consumed?

Should an explicit dma_mb(Full) be added here to guarantee LOAD->STORE
ordering?

>          // SAFETY:
>          // - `data` was created from a valid pointer, and `rx` and `tx` are in the
>          //   `0..MSGQ_NUM_PAGES` range per the invariants of `gsp_write_ptr` and `cpu_read_ptr`,

[ ... ]

> @@ -436,6 +441,9 @@ fn cpu_read_ptr(&self) -> u32 {
>  
>      // Informs the GSP that it can send `elem_count` new pages into the message queue.
>      fn advance_cpu_read_ptr(&mut self, elem_count: u32) {
> +        // ORDERING: Ensure read pointer is properly ordered with a LOAD->STORE ordering.
> +        dma_mb(Full);
> +
>          super::fw::gsp_mem::advance_cpu_read_ptr(&self.0, elem_count)
>      }

[Severity: High]
This is a pre-existing issue, but looking at how advance_cpu_read_ptr()
is used, do the early returns in receive_msg() skip advancing the GSP read
pointer?

drivers/gpu/nova-core/gsp/cmdq.rs:receive_msg() {
    ...
        let function = message.header.function().map_err(|_| EINVAL)?;

        // Extract the message. Store the result as we want to advance the read pointer even in
        // case of failure.
        let result = if function == M::FUNCTION {
            let (cmd, contents_1) = M::Message::from_bytes_prefix(message.contents.0).ok_or(EIO)?;
    ...
        // Advance the read pointer past this message.
        self.gsp_mem.advance_cpu_read_ptr(...);
    ...
}

The comments note that the result is stored to ensure the read pointer is
advanced even on failure. However, the use of the ? operator on
message.header.function() and M::Message::from_bytes_prefix() causes an
immediate return.

If an early return happens here, will subsequent polls read the exact same
malformed message?

Could this cause repeated failures and a permanently stalled command queue?

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260609-rust-barrier-v2-0-30fcc48e1cd0@garyguo.net?part=4

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2 3/4] gpu: nova-core: fix barrier usage in CPU->GSP messaging path
  2026-06-09 15:38 ` [PATCH v2 3/4] gpu: nova-core: fix barrier usage in CPU->GSP messaging path Gary Guo
@ 2026-06-09 15:50   ` sashiko-bot
  0 siblings, 0 replies; 8+ messages in thread
From: sashiko-bot @ 2026-06-09 15:50 UTC (permalink / raw)
  To: Gary Guo; +Cc: Danilo Krummrich, ojeda, Alexandre Courbot, dri-devel

Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider:

Pre-existing issues:
- [High] Missing sequence number validation allows mismatched RPC replies to be accepted after a timeout.
--

commit 0c5cf61612018cf7b218d6031a8329297e4ac9dc
Author: Gary Guo <gary@garyguo.net>

gpu: nova-core: fix barrier usage in CPU->GSP messaging path

This commit fixes the barrier usage in the CPU->GSP messaging path. It
clarifies the LOAD->STORE ordering dependency and moves the STORE->STORE
barrier to immediately before the write pointer advance to prevent the GSP
from observing an advanced pointer before the command is fully written.

> diff --git a/drivers/gpu/nova-core/gsp/cmdq.rs b/drivers/gpu/nova-core/gsp/cmdq.rs
> index 070de0731e950..94c2790d943d0 100644
> --- a/drivers/gpu/nova-core/gsp/cmdq.rs
> +++ b/drivers/gpu/nova-core/gsp/cmdq.rs

[ ... ]

[Severity: High]
This isn't a bug introduced by this patch, but while looking at the
messaging path, I noticed a potential issue with sequence number
validation.

If a previous command times out in Cmdq::send_command():

drivers/gpu/nova-core/gsp/cmdq.rs:Cmdq::send_command() {
    ...
    loop {
        match inner.receive_msg::<M::Reply>(Self::RECEIVE_TIMEOUT) {
            Ok(reply) => break Ok(reply),
            Err(ERANGE) => continue,
            Err(e) => break Err(e),
        }
    }
    ...
}

The timeout returns ETIMEDOUT and drops the command queue lock, but the
delayed reply might eventually arrive and remain in the message queue.

When another command with the same function code is subsequently sent,
CmdqInner::receive_msg() validates the incoming message:

drivers/gpu/nova-core/gsp/cmdq.rs:CmdqInner::receive_msg() {
    ...
    let result = if function == M::FUNCTION {
        let (cmd, contents_1) = M::Message::from_bytes_prefix(...
    ...
}

Does this code incorrectly accept the stale reply as a successful response
because it only checks if function == M::FUNCTION?

Could this break RPC protocol synchronization and cause the driver to
process incorrect payload data? Should the sequence number be validated
(e.g., via header.sequence()) to prevent mismatched replies?

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260609-rust-barrier-v2-0-30fcc48e1cd0@garyguo.net?part=3

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2026-06-09 15:50 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-06-09 15:38 [PATCH v2 0/4] rust: more memory barriers bindings Gary Guo
2026-06-09 15:38 ` [PATCH v2 1/4] rust: sync: add helpers for mb, dma_mb and friends Gary Guo
2026-06-09 15:38 ` [PATCH v2 2/4] rust: sync: generic memory barriers Gary Guo
2026-06-09 15:48   ` sashiko-bot
2026-06-09 15:38 ` [PATCH v2 3/4] gpu: nova-core: fix barrier usage in CPU->GSP messaging path Gary Guo
2026-06-09 15:50   ` sashiko-bot
2026-06-09 15:38 ` [PATCH v2 4/4] gpu: nova-core: fix barrier usage in GSP->CPU " Gary Guo
2026-06-09 15:49   ` sashiko-bot

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.