public inbox for rust-for-linux@vger.kernel.org
* [PATCH 0/3] rust: workqueue: add safe cancellation and status methods
@ 2026-04-02  3:23 Aakash Bollineni via B4 Relay
  2026-04-02  3:23 ` [PATCH 1/3] rust: helpers: add workqueue helpers Aakash Bollineni via B4 Relay
                   ` (3 more replies)
  0 siblings, 4 replies; 7+ messages in thread
From: Aakash Bollineni via B4 Relay @ 2026-04-02  3:23 UTC (permalink / raw)
  To: rust-for-linux; +Cc: Aakash Bollineni, linux-kernel

This series modernizes the Rust workqueue infrastructure by adding
methods for status checking and safe cancellation. These capabilities
are essential for implementing safe driver teardown paths, such as in
'Drop' implementations or device removal, where background tasks must
be stopped before resources are freed.

This version (v2) addresses the feedback from Miguel Ojeda:
- The work has been split into a logical 3-patch series.
- Spurious changes (top-level .kunitconfig) have been removed.
- Initialization for DelayedWork is now handled via a C helper that
  mirrors INIT_DELAYED_WORK, ensuring correct timer function
  registration.
- Pointer arithmetic in the reclamation path has been simplified and
  verified to be underflow-free.

Summary of changes:
1. Helpers: Adds C FFI wrappers for workqueue macros and robust
   DelayedWork initialization.
2. Core API: Implements is_pending(), cancel(), and cancel_sync() with
   guaranteed pointer reclamation and layout safety (#[repr(transparent)]).
3. Tests: Adds a comprehensive KUnit suite and a standalone stress-test
   sample.

The implementation has been verified on x86_64 using KUnit (4/4 tests
passing) and ran without races under high-load stress testing.

Changelog v1 -> v2:
- Split into a 3-patch series to separate helpers from API and tests.
- Replaced manual timer initialization with rust_helper_init_delayed_work.
- Fixed field offset arithmetic in container_of macro path.
- Expanded KUnit tests to cover DelayedWork specifically.
- Wrapped commit messages and cover letter to 75 columns.

Signed-off-by: Aakash Bollineni <aakash.bollineni@multicorewareinc.com>
---
Aakash Bollineni (3):
      rust: helpers: add workqueue helpers
      rust: workqueue: add safe cancellation and status methods
      rust: workqueue: add KUnit and sample stress tests

 rust/helpers/workqueue.c            |  34 +++
 rust/kernel/workqueue.rs            | 467 +++++++++++++++++++++++++++++++++---
 samples/rust/Makefile               |   2 +
 samples/rust/rust_workqueue_test.rs | 214 +++++++++++++++++
 4 files changed, 682 insertions(+), 35 deletions(-)
---
base-commit: bf074eb6891be799174ff42e0051492681fdc045
change-id: 20260402-rust-next-19ba03aad3f4

Best regards,
-- 
Aakash Bollineni <aakash.bollineni@multicorewareinc.com>




* [PATCH 1/3] rust: helpers: add workqueue helpers
  2026-04-02  3:23 [PATCH 0/3] rust: workqueue: add safe cancellation and status methods Aakash Bollineni via B4 Relay
@ 2026-04-02  3:23 ` Aakash Bollineni via B4 Relay
  2026-04-02  3:23 ` [PATCH 2/3] rust: workqueue: add safe cancellation and status methods Aakash Bollineni via B4 Relay
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 7+ messages in thread
From: Aakash Bollineni via B4 Relay @ 2026-04-02  3:23 UTC (permalink / raw)
  To: rust-for-linux; +Cc: Aakash Bollineni, linux-kernel

From: Aakash Bollineni <aakash.bollineni@multicorewareinc.com>

Add C helpers to bridge the Rust workqueue abstraction with the
kernel's C workqueue macros. These helpers wrap core workqueue
functions that are either inline or macros in C, making them
accessible to Rust FFI.

New wrappers:
- rust_helper_work_pending(): Wraps work_pending().
- rust_helper_cancel_work_sync(): Wraps cancel_work_sync().
- rust_helper_cancel_delayed_work_sync(): Wraps cancel_delayed_work_sync().
- rust_helper_init_delayed_work(): Performs robust initialization of
  a delayed_work structure, ensuring the correct timer function
  (delayed_work_timer_fn) and lockdep maps are initialized using
  standard kernel macros.
- rust_helper_get_dwork_timer_fn(): Exposes the address of
  delayed_work_timer_fn to Rust code.

These helpers are essential for supporting safe cancellation and
correct DelayedWork lifecycle management in the Rust workqueue API.

Signed-off-by: Aakash Bollineni <aakash.bollineni@multicorewareinc.com>
---
 rust/helpers/workqueue.c | 34 ++++++++++++++++++++++++++++++++++
 1 file changed, 34 insertions(+)

diff --git a/rust/helpers/workqueue.c b/rust/helpers/workqueue.c
index ce1c3a5b2150..06447bec912a 100644
--- a/rust/helpers/workqueue.c
+++ b/rust/helpers/workqueue.c
@@ -14,3 +14,37 @@ __rust_helper void rust_helper_init_work_with_key(struct work_struct *work,
 	INIT_LIST_HEAD(&work->entry);
 	work->func = func;
 }
+
+__rust_helper bool rust_helper_work_pending(struct work_struct *work)
+{
+	return work_pending(work);
+}
+
+__rust_helper bool rust_helper_cancel_work_sync(struct work_struct *work)
+{
+	return cancel_work_sync(work);
+}
+
+__rust_helper bool rust_helper_cancel_delayed_work_sync(struct delayed_work *dwork)
+{
+	return cancel_delayed_work_sync(dwork);
+}
+
+__rust_helper void rust_helper_init_delayed_work(struct delayed_work *dwork,
+						 work_func_t func,
+						 const char *name,
+						 struct lock_class_key *key,
+						 const char *tname,
+						 struct lock_class_key *tkey)
+{
+	__init_work(&dwork->work, 0);
+	dwork->work.func = func;
+	lockdep_init_map(&dwork->work.lockdep_map, name, key, 0);
+
+	timer_init_key(&dwork->timer, delayed_work_timer_fn, TIMER_IRQSAFE, tname, tkey);
+}
+
+__rust_helper void *rust_helper_get_dwork_timer_fn(void)
+{
+	return (void *)delayed_work_timer_fn;
+}

-- 
2.43.0




* [PATCH 2/3] rust: workqueue: add safe cancellation and status methods
  2026-04-02  3:23 [PATCH 0/3] rust: workqueue: add safe cancellation and status methods Aakash Bollineni via B4 Relay
  2026-04-02  3:23 ` [PATCH 1/3] rust: helpers: add workqueue helpers Aakash Bollineni via B4 Relay
@ 2026-04-02  3:23 ` Aakash Bollineni via B4 Relay
  2026-04-02  3:23 ` [PATCH 3/3] rust: workqueue: add KUnit and sample stress tests Aakash Bollineni via B4 Relay
  2026-04-02 11:19 ` [PATCH 0/3] rust: workqueue: add safe cancellation and status methods Onur Özkan
  3 siblings, 0 replies; 7+ messages in thread
From: Aakash Bollineni via B4 Relay @ 2026-04-02  3:23 UTC (permalink / raw)
  To: rust-for-linux; +Cc: Aakash Bollineni, linux-kernel

From: Aakash Bollineni <aakash.bollineni@multicorewareinc.com>

Modernize the Rust workqueue by adding methods for status checking
and cancellation of work and delayed work items.

Specifically, this patch adds:
- is_pending(): Returns true if the work item is currently enqueued.
- cancel(): Attempts to cancel the work item before it runs. If
  successful, it reclaims and returns ownership of the original
  pointer (Arc/KBox).
- cancel_sync(): Synchronously cancels the work item, waiting for it
  to finish if it's already running. Reclaims ownership if the work
  was pending.

To support safe pointer reclamation, a new "Handle-based" model is
introduced via the WorkItemPointer::reclaim method. This ensures that
the "leaked" ownership from an enqueue operation is safely recovered
without memory leaks.

Additionally, we enforce #[repr(transparent)] on core work wrappers to
ensure memory layout compatibility with C pointers and fix pointer
arithmetic in the work_container_of implementation.

Signed-off-by: Aakash Bollineni <aakash.bollineni@multicorewareinc.com>
---
 rust/kernel/workqueue.rs | 467 +++++++++++++++++++++++++++++++++++++++++++----
 1 file changed, 432 insertions(+), 35 deletions(-)

diff --git a/rust/kernel/workqueue.rs b/rust/kernel/workqueue.rs
index 706e833e9702..3e76164d103d 100644
--- a/rust/kernel/workqueue.rs
+++ b/rust/kernel/workqueue.rs
@@ -448,8 +448,26 @@ pub unsafe trait WorkItemPointer<const ID: u64>: RawWorkItem<ID> {
     /// The provided `work_struct` pointer must originate from a previous call to [`__enqueue`]
     /// where the `queue_work_on` closure returned true, and the pointer must still be valid.
     ///
+    /// The implementation must ensure that the pointer is reclaimed (e.g., via `from_raw`)
+    /// before calling the `run` method of the underlying [`WorkItem`].
+    ///
     /// [`__enqueue`]: RawWorkItem::__enqueue
     unsafe extern "C" fn run(ptr: *mut bindings::work_struct);
+
+    /// Reclaims ownership of the pointer from the work item.
+    ///
+    /// This is called when a work item is successfully cancelled, allowing the caller
+    /// to recover the original pointer (e.g., `Arc` or `KBox`) that was "leaked"
+    /// during enqueuing.
+    ///
+    /// # Safety
+    ///
+    /// The provided `work_struct` pointer must originate from a previous call to [`__enqueue`]
+    /// where the `queue_work_on` closure returned true, and the work item must have been
+    /// successfully cancelled (i.e., `cancel_work` returned true).
+    ///
+    /// [`__enqueue`]: RawWorkItem::__enqueue
+    unsafe fn reclaim(ptr: *mut bindings::work_struct) -> Self;
 }
 
 /// Defines the method that should be called when this work item is executed.
@@ -516,6 +534,156 @@ pub fn new(name: &'static CStr, key: Pin<&'static LockClassKey>) -> impl PinInit
         })
     }
 
+    /// Returns whether the work item is currently pending.
+    ///
+    /// # Warning
+    ///
+    /// This method is inherently racy. A work item can be enqueued or start running
+    /// immediately after this check returns. Do not rely on this for correctness
+    /// logic (e.g., as a guard for unsafe operations); use it only for debugging or
+    /// non-critical status reporting.
+    ///
+    /// # Examples
+    ///
+    /// ```
+    /// # use kernel::workqueue::{self, new_work, Work, WorkItem, HasWork};
+    /// # use kernel::impl_has_work;
+    /// # use kernel::sync::Arc;
+    /// # #[pin_data]
+    /// # struct MyStruct { #[pin] work: Work<MyStruct> }
+    /// # impl_has_work! { impl HasWork<Self> for MyStruct { self.work } }
+    /// # impl WorkItem for MyStruct {
+    /// #     type Pointer = Arc<MyStruct>;
+    /// #     fn run(_this: Arc<MyStruct>) {}
+    /// # }
+    /// let my_struct = Arc::pin_init(pin_init!(MyStruct {
+    ///     work <- new_work!("MyStruct::work"),
+    /// }), kernel::alloc::flags::GFP_KERNEL).unwrap();
+    /// assert!(!my_struct.work.is_pending());
+    /// workqueue::system().enqueue(my_struct.clone());
+    /// // The work item may already have started running on another CPU, so
+    /// // `is_pending()` can legitimately return `false` again here.
+    /// let _ = my_struct.work.cancel();
+    /// ```
+    #[inline]
+    pub fn is_pending(&self) -> bool {
+        // SAFETY: `self.work` is a valid pointer to a `work_struct`.
+        unsafe { bindings::work_pending(self.work.get()) }
+    }
+
+    /// Cancels the work item.
+    ///
+    /// If the work item was successfully cancelled (i.e., it was pending and had not yet
+    /// started running), the original pointer is reclaimed and returned.
+    ///
+    /// # Guarantees
+    ///
+    /// This method is non-blocking and may return while the work item is still running
+    /// on another CPU. If it returns `None`, the work item might be about to start,
+    /// is currently running, or has already finished.
+    ///
+    /// No `unsafe` is required by callers: ownership is only reclaimed if the kernel
+    /// confirms (via `cancel_work` returning true) that the work item is no longer in
+    /// any queue and will not be executed for this specific enqueue event. To also
+    /// wait for a work item that is already running, use [`cancel_sync`].
+    ///
+    /// [`cancel_sync`]: Work::cancel_sync
+    ///
+    /// # Examples
+    ///
+    /// ```
+    /// # use kernel::workqueue::{self, new_work, Work, WorkItem, HasWork};
+    /// # use kernel::impl_has_work;
+    /// # use kernel::sync::Arc;
+    /// # #[pin_data]
+    /// # struct MyStruct { #[pin] work: Work<MyStruct> }
+    /// # impl_has_work! { impl HasWork<Self> for MyStruct { self.work } }
+    /// # impl WorkItem for MyStruct {
+    /// #     type Pointer = Arc<MyStruct>;
+    /// #     fn run(_this: Arc<MyStruct>) {}
+    /// # }
+    /// let my_struct = Arc::pin_init(pin_init!(MyStruct {
+    ///     work <- new_work!("MyStruct::work"),
+    /// }), kernel::alloc::flags::GFP_KERNEL).unwrap();
+    /// workqueue::system().enqueue(my_struct.clone());
+    /// // The work item may already have run, in which case `cancel` returns `None`
+    /// // because there is no pending enqueue left to reclaim.
+    /// if let Some(reclaimed) = my_struct.work.cancel() {
+    ///     drop(reclaimed);
+    /// }
+    /// ```
+    pub fn cancel(&self) -> Option<T::Pointer>
+    where
+        T: WorkItem<ID>,
+    {
+        let work_ptr = self.work.get();
+        // SAFETY: `work_ptr` is a valid pointer to a `work_struct`.
+        if unsafe { bindings::cancel_work(work_ptr) } {
+            // SAFETY: The work item was successfully cancelled and is guaranteed not to run,
+            // so we can safely reclaim the ownership leaked during `enqueue`.
+            Some(unsafe { T::Pointer::reclaim(work_ptr) })
+        } else {
+            None
+        }
+    }
+
+    /// Synchronously cancels the work item.
+    ///
+    /// This method waits for the work item to finish if it is currently running.
+    /// If the work item was successfully cancelled from the queue, the pointer is
+    /// reclaimed and returned.
+    ///
+    /// # Guarantees
+    ///
+    /// After this method returns, the work item is guaranteed to be:
+    /// - Not pending in any queue.
+    /// - Not running on any CPU.
+    ///
+    /// This makes it safe to use during teardown (e.g., driver `remove` or object `drop`)
+    /// to ensure no background tasks are accessing resources that are about to be freed.
+    ///
+    /// As with [`cancel`], ownership is only reclaimed if the kernel confirms the
+    /// work item was still pending and has now been removed from the queue.
+    ///
+    /// [`cancel`]: Work::cancel
+    ///
+    /// # Examples
+    ///
+    /// ```
+    /// # use kernel::workqueue::{self, new_work, Work, WorkItem, HasWork};
+    /// # use kernel::impl_has_work;
+    /// # use kernel::sync::Arc;
+    /// # #[pin_data]
+    /// # struct MyStruct { #[pin] work: Work<MyStruct> }
+    /// # impl_has_work! { impl HasWork<Self> for MyStruct { self.work } }
+    /// # impl WorkItem for MyStruct {
+    /// #     type Pointer = Arc<MyStruct>;
+    /// #     fn run(_this: Arc<MyStruct>) {}
+    /// # }
+    /// let my_struct = Arc::pin_init(pin_init!(MyStruct {
+    ///     work <- new_work!("MyStruct::work"),
+    /// }), kernel::alloc::flags::GFP_KERNEL).unwrap();
+    /// workqueue::system().enqueue(my_struct.clone());
+    /// // `None` means the work item already ran and consumed its reference.
+    /// let _reclaimed = my_struct.work.cancel_sync();
+    /// ```
+    pub fn cancel_sync(&self) -> Option<T::Pointer>
+    where
+        T: WorkItem<ID>,
+    {
+        let work_ptr = self.work.get();
+        // SAFETY: `work_ptr` is a valid pointer to a `work_struct`.
+        if unsafe { bindings::cancel_work_sync(work_ptr) } {
+            // SAFETY: The work item was successfully cancelled/waited for, and is guaranteed
+            // not to run again unless re-enqueued. We reclaim the ownership leaked during
+            // `enqueue`.
+            Some(unsafe { T::Pointer::reclaim(work_ptr) })
+        } else {
+            None
+        }
+    }
+
     /// Get a pointer to the inner `work_struct`.
     ///
     /// # Safety
@@ -674,25 +842,14 @@ pub fn new(
         pin_init!(Self {
             dwork <- Opaque::ffi_init(|slot: *mut bindings::delayed_work| {
                 // SAFETY: The `WorkItemPointer` implementation promises that `run` can be used as
-                // the work item function.
+                // the work item function. We use the C-helper to ensure the timer function
+                // and data are initialized correctly according to kernel macros.
                 unsafe {
-                    bindings::init_work_with_key(
-                        core::ptr::addr_of_mut!((*slot).work),
+                    bindings::init_delayed_work(
+                        slot,
                         Some(T::Pointer::run),
-                        false,
                         work_name.as_char_ptr(),
                         work_key.as_ptr(),
-                    )
-                }
-
-                // SAFETY: The `delayed_work_timer_fn` function pointer can be used here because
-                // the timer is embedded in a `struct delayed_work`, and only ever scheduled via
-                // the core workqueue code, and configured to run in irqsafe context.
-                unsafe {
-                    bindings::timer_init_key(
-                        core::ptr::addr_of_mut!((*slot).timer),
-                        Some(bindings::delayed_work_timer_fn),
-                        bindings::TIMER_IRQSAFE,
                         timer_name.as_char_ptr(),
                         timer_key.as_ptr(),
                     )
@@ -702,6 +859,89 @@ pub fn new(
         })
     }
 
+    /// Returns whether the work item is currently pending.
+    ///
+    /// # Warning
+    ///
+    /// This method is inherently racy. See [`Work::is_pending`].
+    ///
+    /// # Examples
+    ///
+    /// See [`Work::is_pending`].
+    ///
+    /// [`Work::is_pending`]: Work::is_pending
+    #[inline]
+    pub fn is_pending(&self) -> bool {
+        // SAFETY: `self.dwork` points to a valid `delayed_work`, so taking the
+        // address of its `work` field yields a valid `work_struct` pointer.
+        unsafe {
+            let ptr: *mut bindings::delayed_work = self.dwork.get();
+            bindings::work_pending(core::ptr::addr_of_mut!((*ptr).work))
+        }
+    }
+
+    /// Cancels the delayed work item.
+    ///
+    /// If the work item was successfully cancelled (i.e., it was pending and had not yet
+    /// started running), the original pointer is reclaimed and returned.
+    ///
+    /// # Guarantees
+    ///
+    /// See [`Work::cancel`].
+    ///
+    /// As with [`Work::cancel`], ownership is only reclaimed if the kernel confirms
+    /// the cancellation; to also wait for a running item, use [`cancel_sync`].
+    ///
+    /// [`cancel_sync`]: DelayedWork::cancel_sync
+    /// [`Work::cancel`]: Work::cancel
+    ///
+    /// # Examples
+    ///
+    /// See [`Work::cancel`].
+    pub fn cancel(&self) -> Option<T::Pointer>
+    where
+        T: WorkItem<ID>,
+    {
+        let dwork_ptr = self.dwork.get();
+        // SAFETY: `dwork_ptr` is a valid pointer to a `delayed_work`.
+        if unsafe { bindings::cancel_delayed_work(dwork_ptr) } {
+            // SAFETY: The work item was successfully cancelled and is guaranteed not to run,
+            // so we can safely reclaim the ownership leaked during `enqueue`.
+            Some(unsafe { T::Pointer::reclaim(core::ptr::addr_of_mut!((*dwork_ptr).work)) })
+        } else {
+            None
+        }
+    }
+
+    /// Synchronously cancels the delayed work item.
+    ///
+    /// This method waits for the work item to finish if it is currently running.
+    /// If the work item was successfully cancelled from the queue, the pointer is
+    /// reclaimed and returned.
+    ///
+    /// # Guarantees
+    ///
+    /// See [`Work::cancel_sync`].
+    ///
+    /// As with [`Work::cancel_sync`], ownership is only reclaimed if the kernel
+    /// confirms the work item was still pending.
+    pub fn cancel_sync(&self) -> Option<T::Pointer>
+    where
+        T: WorkItem<ID>,
+    {
+        let dwork_ptr = self.dwork.get();
+        // SAFETY: `dwork_ptr` is a valid pointer to a `delayed_work`.
+        if unsafe { bindings::cancel_delayed_work_sync(dwork_ptr) } {
+            // SAFETY: The work item was successfully cancelled/waited for, and is guaranteed
+            // not to run again unless re-enqueued. We reclaim the ownership leaked during
+            // `enqueue`.
+            Some(unsafe { T::Pointer::reclaim(core::ptr::addr_of_mut!((*dwork_ptr).work)) })
+        } else {
+            None
+        }
+    }
+
     /// Get a pointer to the inner `delayed_work`.
     ///
     /// # Safety
@@ -781,22 +1021,11 @@ unsafe fn raw_get_work(
             unsafe fn work_container_of(
                 ptr: *mut $crate::workqueue::Work<$work_type $(, $id)?>,
             ) -> *mut Self {
-                // SAFETY: The caller promises that the pointer points at a field of the right type
-                // in the right kind of struct.
-                let ptr = unsafe { $crate::workqueue::Work::raw_get(ptr) };
-
-                // SAFETY: The caller promises that the pointer points at a field of the right type
-                // in the right kind of struct.
-                let delayed_work = unsafe {
-                    $crate::container_of!(ptr, $crate::bindings::delayed_work, work)
-                };
-
-                let delayed_work: *mut $crate::workqueue::DelayedWork<$work_type $(, $id)?> =
-                    delayed_work.cast();
-
-                // SAFETY: The caller promises that the pointer points at a field of the right type
-                // in the right kind of struct.
-                unsafe { $crate::container_of!(delayed_work, Self, $field) }
+                // SAFETY: The caller promises that the pointer points at the `work` field
+                // of a `delayed_work` struct, which is itself the `dwork` field of a
+                // `DelayedWork` wrapper, which is the `$field` field of a `Self` struct.
+                // Since `work` is the first field of `struct delayed_work` and
+                // `DelayedWork` is a `#[repr(transparent)]` wrapper around it, the cast
+                // below is zero-offset and therefore valid.
+                let ptr = ptr.cast::<$crate::workqueue::DelayedWork<$work_type $(, $id)?>>();
+                unsafe { $crate::container_of!(ptr, Self, $field) }
             }
         }
     )*};
@@ -827,6 +1056,15 @@ unsafe impl<T, const ID: u64> WorkItemPointer<ID> for Arc<T>
 
         T::run(arc)
     }
+
+    unsafe fn reclaim(ptr: *mut bindings::work_struct) -> Self {
+        // The `__enqueue` method always uses a `work_struct` stored in a `Work<T, ID>`.
+        let ptr = ptr.cast::<Work<T, ID>>();
+        // SAFETY: This computes the pointer that `__enqueue` got from `Arc::into_raw`.
+        let ptr = unsafe { T::work_container_of(ptr) };
+        // SAFETY: This pointer comes from `Arc::into_raw` and we've been given back ownership.
+        unsafe { Arc::from_raw(ptr) }
+    }
 }
 
 // SAFETY: The `work_struct` raw pointer is guaranteed to be valid for the duration of the call to
@@ -874,7 +1112,8 @@ unsafe impl<T, const ID: u64> RawDelayedWorkItem<ID> for Arc<T>
 {
 }
 
-// SAFETY: TODO.
+// SAFETY: The `WorkItemPointer` implementation for `Pin<KBox<T>>` is safe because `KBox::from_raw`
+// correctly reconstructs the box that was leaked during `enqueue` (via `KBox::into_raw`).
 unsafe impl<T, const ID: u64> WorkItemPointer<ID> for Pin<KBox<T>>
 where
     T: WorkItem<ID, Pointer = Self>,
@@ -883,18 +1122,35 @@ unsafe impl<T, const ID: u64> WorkItemPointer<ID> for Pin<KBox<T>>
     unsafe extern "C" fn run(ptr: *mut bindings::work_struct) {
         // The `__enqueue` method always uses a `work_struct` stored in a `Work<T, ID>`.
         let ptr = ptr.cast::<Work<T, ID>>();
-        // SAFETY: This computes the pointer that `__enqueue` got from `Arc::into_raw`.
+        // SAFETY: This computes the pointer that `__enqueue` got from `KBox::into_raw`.
         let ptr = unsafe { T::work_container_of(ptr) };
-        // SAFETY: This pointer comes from `Arc::into_raw` and we've been given back ownership.
+        // SAFETY: This pointer comes from `KBox::into_raw` and we've been given back ownership.
         let boxed = unsafe { KBox::from_raw(ptr) };
         // SAFETY: The box was already pinned when it was enqueued.
         let pinned = unsafe { Pin::new_unchecked(boxed) };
 
         T::run(pinned)
     }
+
+    unsafe fn reclaim(ptr: *mut bindings::work_struct) -> Self {
+        // The `__enqueue` method always uses a `work_struct` stored in a `Work<T, ID>`.
+        let ptr = ptr.cast::<Work<T, ID>>();
+        // SAFETY: This computes the pointer that `__enqueue` got from `KBox::into_raw`.
+        let ptr = unsafe { T::work_container_of(ptr) };
+        // SAFETY: This pointer comes from `KBox::into_raw` and we've been given back ownership.
+        let boxed = unsafe { KBox::from_raw(ptr) };
+        // SAFETY: The box was already pinned when it was enqueued.
+        unsafe { Pin::new_unchecked(boxed) }
+    }
 }
 
-// SAFETY: TODO.
+// SAFETY: The `work_struct` raw pointer is guaranteed to be valid for the duration of the call to
+// the closure because we have exclusive ownership of the `KBox`, and we don't drop it ourselves.
+// If `queue_work_on` returns true, it is further guaranteed to be valid until a call to the
+// function pointer in `work_struct` because we leak the memory it points to, and only reclaim it
+// if the closure returns false (which cannot happen for a `KBox`, since exclusive
+// ownership guarantees the item is not already queued) or in `WorkItemPointer::run`,
+// which is what the function pointer in the `work_struct` must be pointing to.
 unsafe impl<T, const ID: u64> RawWorkItem<ID> for Pin<KBox<T>>
 where
     T: WorkItem<ID, Pointer = Self>,
@@ -1022,3 +1278,144 @@ pub fn system_bh_highpri() -> &'static Queue {
     // SAFETY: `system_bh_highpri_wq` is a C global, always available.
     unsafe { Queue::from_raw(bindings::system_bh_highpri_wq) }
 }
+
+#[macros::kunit_tests(rust_kernel_workqueue)]
+mod tests {
+    use super::*;
+    use crate::sync::Arc;
+
+    #[pin_data]
+    struct TestWorkItem {
+        #[pin]
+        work: Work<TestWorkItem>,
+        value: i32,
+    }
+
+    impl_has_work! {
+        impl HasWork<Self> for TestWorkItem { self.work }
+    }
+
+    impl WorkItem for TestWorkItem {
+        type Pointer = Arc<Self>;
+        fn run(_this: Arc<Self>) {}
+    }
+
+    #[pin_data]
+    struct TestDelayedWorkItem {
+        #[pin]
+        delayed_work: DelayedWork<TestDelayedWorkItem>,
+        value: i32,
+    }
+
+    impl_has_delayed_work! {
+        impl HasDelayedWork<Self> for TestDelayedWorkItem { self.delayed_work }
+    }
+
+    impl WorkItem for TestDelayedWorkItem {
+        type Pointer = Arc<Self>;
+        fn run(_this: Arc<Self>) {}
+    }
+
+    /// Helper to get Arc strong count for verification in tests.
+    fn get_arc_count<T: Sized>(arc: &Arc<T>) -> i32 {
+        // SAFETY: ArcInner has refcount as its first field. Arc points to data at DATA_OFFSET.
+        unsafe {
+            let ptr = Arc::as_ptr(arc);
+            let inner_ptr = (ptr as *const u8).sub(Arc::<T>::DATA_OFFSET);
+            let refcount_ptr = inner_ptr as *const i32;
+            core::ptr::read_volatile(refcount_ptr)
+        }
+    }
+
+    #[test]
+    fn test_work_cancel_reclaim() {
+        let item = Arc::pin_init(
+            pin_init!(TestWorkItem {
+                work <- new_work!("TestWorkItem::work"),
+                value: 42,
+            }),
+            GFP_KERNEL,
+        )
+        .expect("Failed to allocate TestWorkItem");
+
+        let initial_count = get_arc_count(&item);
+
+        // Enqueue
+        let _ = system().enqueue(item.clone());
+
+        // Cancel and Reclaim (if it was pending)
+        if let Some(reclaimed) = item.work.cancel() {
+            assert_eq!(get_arc_count(&item), initial_count + 1);
+            drop(reclaimed);
+            assert_eq!(get_arc_count(&item), initial_count);
+        }
+    }
+
+    #[test]
+    fn test_work_cancel_sync_reclaim() {
+        let item = Arc::pin_init(
+            pin_init!(TestWorkItem {
+                work <- new_work!("TestWorkItem::work"),
+                value: 42,
+            }),
+            GFP_KERNEL,
+        )
+        .expect("Failed to allocate TestWorkItem");
+
+        let initial_count = get_arc_count(&item);
+
+        // Enqueue
+        let _ = system().enqueue(item.clone());
+
+        // Cancel Sync and Reclaim
+        if let Some(reclaimed) = item.work.cancel_sync() {
+            assert_eq!(get_arc_count(&item), initial_count + 1);
+            drop(reclaimed);
+            assert_eq!(get_arc_count(&item), initial_count);
+        }
+    }
+
+    #[test]
+    fn test_work_stress_enqueue_cancel() {
+        let item = Arc::pin_init(
+            pin_init!(TestWorkItem {
+                work <- new_work!("TestWorkItem::work"),
+                value: 42,
+            }),
+            GFP_KERNEL,
+        )
+        .expect("Failed to allocate TestWorkItem");
+
+        for _ in 0..100 {
+            if let Ok(_) = system().enqueue(item.clone()) {
+                let _ = item.work.cancel_sync();
+            }
+        }
+
+        assert_eq!(get_arc_count(&item), 1);
+    }
+
+    #[test]
+    fn test_delayed_work_cancel_reclaim() {
+        let item = Arc::pin_init(
+            pin_init!(TestDelayedWorkItem {
+                delayed_work <- new_delayed_work!("TestDelayedWorkItem::delayed_work"),
+                value: 42,
+            }),
+            GFP_KERNEL,
+        )
+        .expect("Failed to allocate TestDelayedWorkItem");
+
+        let initial_count = get_arc_count(&item);
+
+        // Enqueue delayed
+        let _ = system().enqueue_delayed(item.clone(), 100);
+
+        // Cancel and Reclaim
+        if let Some(reclaimed) = item.delayed_work.cancel() {
+            assert_eq!(get_arc_count(&item), initial_count + 1);
+            drop(reclaimed);
+            assert_eq!(get_arc_count(&item), initial_count);
+        }
+    }
+}

-- 
2.43.0




* [PATCH 3/3] rust: workqueue: add KUnit and sample stress tests
  2026-04-02  3:23 [PATCH 0/3] rust: workqueue: add safe cancellation and status methods Aakash Bollineni via B4 Relay
  2026-04-02  3:23 ` [PATCH 1/3] rust: helpers: add workqueue helpers Aakash Bollineni via B4 Relay
  2026-04-02  3:23 ` [PATCH 2/3] rust: workqueue: add safe cancellation and status methods Aakash Bollineni via B4 Relay
@ 2026-04-02  3:23 ` Aakash Bollineni via B4 Relay
  2026-04-02 11:23   ` Onur Özkan
  2026-04-02 11:34   ` Onur Özkan
  2026-04-02 11:19 ` [PATCH 0/3] rust: workqueue: add safe cancellation and status methods Onur Özkan
  3 siblings, 2 replies; 7+ messages in thread
From: Aakash Bollineni via B4 Relay @ 2026-04-02  3:23 UTC (permalink / raw)
  To: rust-for-linux; +Cc: Aakash Bollineni, linux-kernel

From: Aakash Bollineni <aakash.bollineni@multicorewareinc.com>

To ensure the safety and correctness of the improved workqueue API,
this patch adds comprehensive testing infrastructure:
1. KUnit Tests: Adds an internal 'rust_kernel_workqueue' test suite
   to rust/kernel/workqueue.rs. These tests verify basic and
   synchronous cancellation, refcount stability, and concurrency
   safety for both standard and delayed work.
2. Sample Module: Adds samples/rust/rust_workqueue_test.rs as a
   standalone module that performs a 1000-iteration stress test
   designed to verify race-free pointer handover during concurrent
   enqueue/cancel operations.

The tests use distinct types for standard and delayed work items to
ensure clear trait dispatch and prevent field offset conflicts.

Signed-off-by: Aakash Bollineni <aakash.bollineni@multicorewareinc.com>
---
 samples/rust/Makefile               |   2 +
 samples/rust/rust_workqueue_test.rs | 214 ++++++++++++++++++++++++++++++++++++
 2 files changed, 216 insertions(+)

diff --git a/samples/rust/Makefile b/samples/rust/Makefile
index 6c0aaa58cccc..0f304bd90997 100644
--- a/samples/rust/Makefile
+++ b/samples/rust/Makefile
@@ -20,3 +20,5 @@ obj-$(CONFIG_SAMPLE_RUST_SOC)			+= rust_soc.o
 rust_print-y := rust_print_main.o rust_print_events.o
 
 subdir-$(CONFIG_SAMPLE_RUST_HOSTPROGS)		+= hostprogs
+
+obj-y += rust_workqueue_test.o
diff --git a/samples/rust/rust_workqueue_test.rs b/samples/rust/rust_workqueue_test.rs
new file mode 100644
index 000000000000..c055ec1964c0
--- /dev/null
+++ b/samples/rust/rust_workqueue_test.rs
@@ -0,0 +1,214 @@
+// SPDX-License-Identifier: GPL-2.0
+
+//! Robust stress test for Rust workqueue API.
+
+use kernel::prelude::*;
+use kernel::sync::Arc;
+use kernel::time::msecs_to_jiffies;
+use kernel::workqueue::{self, new_work, Work, WorkItem};
+
+#[pin_data]
+struct TestItem {
+    #[pin]
+    work: Work<TestItem>,
+    value: i32,
+    #[pin]
+    delayed_work: workqueue::DelayedWork<TestItem>,
+}
+
+kernel::impl_has_work! {
+    impl HasWork<Self> for TestItem { self.work }
+}
+
+// SAFETY: The `delayed_work` field is at a fixed offset and is valid for the lifetime of
+// `TestItem`.
+unsafe impl workqueue::HasDelayedWork<Self> for TestItem {}
+
+impl WorkItem for TestItem {
+    type Pointer = Arc<TestItem>;
+
+    fn run(this: Arc<TestItem>) {
+        pr_info!(
+            "Rust workqueue test: Work item running (value: {})\n",
+            this.value
+        );
+    }
+}
+
+/// Helper to get Arc strong count for verification in tests.
+/// This uses internal layout knowledge of Arc.
+fn get_arc_count<T: Sized>(arc: &Arc<T>) -> i32 {
+    // SAFETY: ArcInner has refcount as its first field. Arc points to data at DATA_OFFSET.
+    unsafe {
+        let ptr = Arc::as_ptr(arc);
+        let inner_ptr = (ptr as *const u8).sub(Arc::<T>::DATA_OFFSET);
+        // The first field of ArcInner is Refcount, which is a transparent wrapper around
+        // refcount_t. In the kernel, refcount_t is an atomic_t (i32).
+        let refcount_ptr = inner_ptr as *const i32;
+        // A plain volatile read is sufficient for this test-only snapshot.
+        core::ptr::read_volatile(refcount_ptr)
+    }
+}
+
+struct RustWorkqueueTest;
+
+impl kernel::Module for RustWorkqueueTest {
+    fn init(_module: &'static ThisModule) -> Result<Self> {
+        pr_info!("Rust workqueue test: starting robust verification\n");
+
+        // 1. Basic Lifecycle with Refcount Validation
+        {
+            let test_item = Arc::pin_init(
+                pin_init!(TestItem {
+                    work <- new_work!("TestItem::work"),
+                    value: 42,
+                    delayed_work <- workqueue::new_delayed_work!("TestItem::delayed_work"),
+                }),
+                GFP_KERNEL,
+            )?;
+
+            let initial_count = get_arc_count(&test_item);
+            pr_info!("Initial Arc strong count: {}\n", initial_count);
+
+            // Enqueue
+            let enqueued_item = test_item.clone();
+
+            if let Err(returned_item) = workqueue::system().enqueue(enqueued_item) {
+                pr_warn!("Work already pending, unexpected!\n");
+                let _ = returned_item;
+            } else {
+                pr_info!(
+                    "Work enqueued successfully. Strong count: {}\n",
+                    get_arc_count(&test_item)
+                );
+            }
+
+            // Cancel immediately (best effort)
+            if let Some(reclaimed) = test_item.work.cancel() {
+                let count_after_cancel = get_arc_count(&test_item);
+                pr_info!(
+                    "Success: Work cancelled and Arc reclaimed. Strong count: {}\n",
+                    count_after_cancel
+                );
+
+                // VALIDATION: the reclaimed Arc still holds one reference, so
+                // the count must be initial_count + 1 until it is dropped.
+                if count_after_cancel != initial_count + 1 {
+                    pr_err!(
+                        "ERROR: Refcount mismatch after cancel! Expected {}, got {}\n",
+                        initial_count + 1,
+                        count_after_cancel
+                    );
+                    return Err(ENXIO);
+                }
+                drop(reclaimed);
+                if get_arc_count(&test_item) != initial_count {
+                    pr_err!(
+                        "ERROR: Refcount mismatch after drop! Expected {}, got {}\n",
+                        initial_count,
+                        get_arc_count(&test_item)
+                    );
+                    return Err(ENXIO);
+                }
+            } else {
+                pr_info!("Work already running or finished, could not reclaim via cancel().\n");
+            }
+        }
+
+        // 2. Stress Testing: Enqueue/Cancel Sync Loop
+        {
+            pr_info!("Starting stress test (1000 iterations)...\n");
+            let test_item = Arc::pin_init(
+                pin_init!(TestItem {
+                    work <- new_work!("TestItem::work"),
+                    value: 99,
+                    delayed_work <- workqueue::new_delayed_work!("TestItem::delayed_work"),
+                }),
+                GFP_KERNEL,
+            )?;
+
+            for i in 0..1000 {
+                let _ = workqueue::system().enqueue(test_item.clone());
+                // Use cancel_sync to ensure determinism for the next iteration
+                let _ = test_item.work.cancel_sync();
+
+                if i % 250 == 0 {
+                    pr_info!("Stress test progress: {}/1000\n", i);
+                }
+            }
+
+            if get_arc_count(&test_item) != 1 {
+                pr_err!(
+                    "ERROR: Refcount leak detected after stress test! count: {}\n",
+                    get_arc_count(&test_item)
+                );
+                return Err(ENXIO);
+            } else {
+                pr_info!("Stress test completed successfully. No refcount leaks.\n");
+            }
+        }
+
+        // 3. Delayed Work Cancellation Test
+        {
+            let test_item = Arc::pin_init(
+                pin_init!(TestItem {
+                    work <- new_work!("TestItem::work"),
+                    value: 7,
+                    delayed_work <- workqueue::new_delayed_work!("TestItem::delayed_work"),
+                }),
+                GFP_KERNEL,
+            )?;
+
+            let initial_count = get_arc_count(&test_item);
+
+            // Schedule with a long delay
+            let to_enqueue = test_item.clone();
+            let res = workqueue::system()
+                .enqueue_delayed(to_enqueue, msecs_to_jiffies(5000));
+            if let Err(returned) = res {
+                pr_warn!("Delayed work already pending, returned item.\n");
+                drop(returned);
+            } else {
+                pr_info!("Delayed work enqueued. count: {}\n", get_arc_count(&test_item));
+            }
+
+            if test_item.delayed_work.is_pending() {
+                pr_info!("Delayed work is pending as expected.\n");
+            }
+
+            if let Some(reclaimed) = test_item.delayed_work.cancel() {
+                pr_info!("Success: Delayed work reclaimed. No leak.\n");
+                drop(reclaimed);
+            } else {
+                pr_warn!("Notice: Delayed work not reclaimed (running or never enqueued).\n");
+            }
+
+            let final_count = get_arc_count(&test_item);
+            if final_count != initial_count {
+                pr_err!(
+                    "ERROR: Refcount leak after delayed cancel! expected {}, got {}\n",
+                    initial_count,
+                    final_count
+                );
+                return Err(ENXIO);
+            }
+        }
+
+        pr_info!("Rust workqueue test: all robust checks passed\n");
+        Ok(RustWorkqueueTest)
+    }
+}
+
+impl Drop for RustWorkqueueTest {
+    fn drop(&mut self) {
+        pr_info!("Rust workqueue test: exit\n");
+    }
+}
+
+module! {
+    type: RustWorkqueueTest,
+    name: "rust_workqueue_test",
+    authors: ["Aakash Bollineni"],
+    description: "Robust stress test for Rust workqueue API",
+    license: "GPL",
+}

-- 
2.43.0

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH 0/3] rust: workqueue: add safe cancellation and status methods
  2026-04-02  3:23 [PATCH 0/3] rust: workqueue: add safe cancellation and status methods Aakash Bollineni via B4 Relay
                   ` (2 preceding siblings ...)
  2026-04-02  3:23 ` [PATCH 3/3] rust: workqueue: add KUnit and sample stress tests Aakash Bollineni via B4 Relay
@ 2026-04-02 11:19 ` Onur Özkan
  3 siblings, 0 replies; 7+ messages in thread
From: Onur Özkan @ 2026-04-02 11:19 UTC (permalink / raw)
  To: Aakash Bollineni via B4 Relay
  Cc: aakash.bollineni, rust-for-linux, linux-kernel

Hi Aakash,

On Thu, 02 Apr 2026 08:53:45 +0530
Aakash Bollineni via B4 Relay
<devnull+aakash.bollineni.multicorewareinc.com@kernel.org> wrote:

> This series modernizes the Rust workqueue infrastructure by adding
> methods for status checking and safe cancellation. These capabilities
> are essential for implementing safe driver teardown paths, such as in
> 'Drop' implementations or device removal, where background tasks must
> be stopped before resources are freed.
> This version (v2) addresses the feedback from Miguel Ojeda:

I thought this series was v1 because it doesn't have the version
prefix. Please make sure to include the version tag in your subject
line (e.g., [PATCH v2 0/3]) so maintainers can track the revisions
properly.

Thanks,
Onur

> - The work has been split into a logical 3-patch series.
> - Spurious changes (top-level .kunitconfig) have been removed.
> - Initialization for DelayedWork is now handled via a robust C-helper
>   (INIT_DELAYED_WORK) to ensure correct timer function registration.
> - Pointer arithmetic in the reclamation path has been simplified and
>   verified to be underflow-free.
> Summary of changes:
> 1. Helpers: Adds C FFI wrappers for workqueue macros and robust
>    DelayedWork initialization.
> 2. Core API: Implements is_pending(), cancel(), and cancel_sync() with
>    guaranteed pointer reclamation and layout safety
> (#[repr(transparent)]). 3. Tests: Adds a comprehensive KUnit suite
> and a standalone stress-test sample.
> The implementation has been verified on x86_64 using KUnit (4/4
> PASSED) and confirmed to be race-free under high-load stress testing.
> Changelog v1 -> v2:
> - Split into a 3-patch series to separate helpers from API and tests.
> - Replaced manual timer initialization with
> rust_helper_init_delayed_work.
> - Fixed field offset arithmetic in container_of macro path.
> - Expanded KUnit tests to cover DelayedWork specifically.
> - Wrapped commit messages and cover letter to 75 columns.
> 
> Signed-off-by: Aakash Bollineni
> <aakash.bollineni@multicorewareinc.com> ---
> Aakash Bollineni (3):
>       rust: helpers: add workqueue helpers
>       rust: workqueue: add safe cancellation and status methods
>       rust: workqueue: add KUnit and sample stress tests
> 
>  rust/helpers/workqueue.c            |  34 +++
>  rust/kernel/workqueue.rs            | 467
> +++++++++++++++++++++++++++++++++--- samples/rust/Makefile
>    |   2 + samples/rust/rust_workqueue_test.rs | 214 +++++++++++++++++
>  4 files changed, 682 insertions(+), 35 deletions(-)
> ---
> base-commit: bf074eb6891be799174ff42e0051492681fdc045
> change-id: 20260402-rust-next-19ba03aad3f4
> 
> Best regards,


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH 3/3] rust: workqueue: add KUnit and sample stress tests
  2026-04-02  3:23 ` [PATCH 3/3] rust: workqueue: add KUnit and sample stress tests Aakash Bollineni via B4 Relay
@ 2026-04-02 11:23   ` Onur Özkan
  2026-04-02 11:34   ` Onur Özkan
  1 sibling, 0 replies; 7+ messages in thread
From: Onur Özkan @ 2026-04-02 11:23 UTC (permalink / raw)
  To: Aakash Bollineni via B4 Relay
  Cc: aakash.bollineni, rust-for-linux, linux-kernel

On Thu, 02 Apr 2026 08:53:48 +0530
Aakash Bollineni via B4 Relay
<devnull+aakash.bollineni.multicorewareinc.com@kernel.org> wrote:

> From: Aakash Bollineni <aakash.bollineni@multicorewareinc.com>
> 
> To ensure the safety and correctness of the improved workqueue API,
> this patch adds comprehensive testing infrastructure:
> 1. KUnit Tests: Adds an internal 'rust_kernel_workqueue' test suite
>    to rust/kernel/workqueue.rs. These tests verify basic and
>    synchronous cancellation, refcount stability, and concurrency
>    safety for both standard and delayed work.
> 2. Sample Module: Adds samples/rust/rust_workqueue_test.rs as a
>    standalone module that performs a 1000-iteration stress test
>    designed to verify race-free pointer handover during concurrent
>    enqueue/cancel operations.
> The tests use distinct types for standard and delayed work items to
> ensure clear trait dispatch and prevent field offset conflicts.
> 
> Signed-off-by: Aakash Bollineni
> <aakash.bollineni@multicorewareinc.com> ---
>  samples/rust/Makefile               |   2 +
>  samples/rust/rust_workqueue_test.rs | 214
> ++++++++++++++++++++++++++++++++++++ 2 files changed, 216
> insertions(+)
> 
> diff --git a/samples/rust/Makefile b/samples/rust/Makefile
> index 6c0aaa58cccc..0f304bd90997 100644
> --- a/samples/rust/Makefile
> +++ b/samples/rust/Makefile
> @@ -20,3 +20,5 @@ obj-$(CONFIG_SAMPLE_RUST_SOC)
> 	+= rust_soc.o rust_print-y := rust_print_main.o
> rust_print_events.o 
>  subdir-$(CONFIG_SAMPLE_RUST_HOSTPROGS)		+= hostprogs
> +
> +obj-y += rust_workqueue_test.o
> diff --git a/samples/rust/rust_workqueue_test.rs
> b/samples/rust/rust_workqueue_test.rs new file mode 100644
> index 000000000000..c055ec1964c0
> --- /dev/null
> +++ b/samples/rust/rust_workqueue_test.rs
> @@ -0,0 +1,214 @@
> +// SPDX-License-Identifier: GPL-2.0
> +
> +//! Robust stress test for Rust workqueue API.
> +
> +use kernel::prelude::*;
> +use kernel::sync::Arc;
> +use kernel::time::msecs_to_jiffies;
> +use kernel::workqueue::{self, new_work, Work, WorkItem};
> +
> +#[pin_data]
> +struct TestItem {
> +    #[pin]
> +    work: Work<TestItem>,
> +    value: i32,
> +    #[pin]
> +    delayed_work: workqueue::DelayedWork<TestItem>,
> +}
> +
> +kernel::impl_has_work! {
> +    impl HasWork<Self> for TestItem { self.work }
> +}
> +
> +// SAFETY: The `delayed_work` field is at a fixed offset and is
> valid for the lifetime of +// `TestItem`.
> +unsafe impl workqueue::HasDelayedWork<Self> for TestItem {}
> +
> +impl WorkItem for TestItem {
> +    type Pointer = Arc<TestItem>;
> +
> +    fn run(this: Arc<TestItem>) {
> +        pr_info!(
> +            "Rust workqueue test: Work item running (value: {})\n",
> +            this.value
> +        );
> +    }
> +}
> +
> +/// Helper to get Arc strong count for verification in tests.
> +/// This uses internal layout knowledge of Arc.
> +fn get_arc_count<T: Sized>(arc: &Arc<T>) -> i32 {
> +    // SAFETY: ArcInner has refcount as its first field. Arc points
> to data at DATA_OFFSET.
> +    unsafe {
> +        let ptr = Arc::as_ptr(arc);
> +        let inner_ptr = (ptr as *const
> u8).sub(Arc::<T>::DATA_OFFSET);
> +        // The first field of ArcInner is Refcount, which is a
> transparent wrapper around
> +        // refcount_t. In the kernel, refcount_t is an atomic_t
> (i32).
> +        let refcount_ptr = inner_ptr as *const i32;
> +        // We use a relaxed load to get the current value.
> +        core::ptr::read_volatile(refcount_ptr)
> +    }
> +}

Patch 2 also adds this function. Can we move it to a shared location to
avoid duplication? They are identical in behavior except this version
has extra explanatory comments (which is a nice addition).

> +
> +struct RustWorkqueueTest;
> +
> +impl kernel::Module for RustWorkqueueTest {
> +    fn init(_module: &'static ThisModule) -> Result<Self> {
> +        pr_info!("Rust workqueue test: starting robust
> verification\n"); +
> +        // 1. Basic Lifecycle with Refcount Validation
> +        {
> +            let test_item = Arc::pin_init(
> +                pin_init!(TestItem {
> +                    work <- new_work!("TestItem::work"),
> +                    value: 42,
> +                    delayed_work <-
> workqueue::new_delayed_work!("TestItem::delayed_work"),
> +                }),
> +                GFP_KERNEL,
> +            )?;
> +
> +            let initial_count = get_arc_count(&test_item);
> +            pr_info!("Initial Arc strong count: {}\n",
> initial_count); +
> +            // Enqueue
> +            let enqueued_item = test_item.clone();
> +
> +            if let Err(returned_item) =
> workqueue::system().enqueue(enqueued_item) {
> +                pr_warn!("Work already pending, unexpected!\n");
> +                let _ = returned_item;
> +            } else {
> +                pr_info!(
> +                    "Work enqueued successfully. Strong count: {}\n",
> +                    get_arc_count(&test_item)
> +                );
> +            }
> +
> +            // Cancel immediately (best effort)
> +            if let Some(reclaimed) = test_item.work.cancel() {
> +                let count_after_cancel = get_arc_count(&test_item);
> +                pr_info!(
> +                    "Success: Work cancelled and Arc reclaimed.
> Strong count: {}\n",
> +                    count_after_cancel
> +                );
> +
> +                // VALIDATION: Reclamation must restore the refcount
> (minus the clone we just
> +                // reclaimed)
> +                if count_after_cancel != initial_count + 1 {
> +                    pr_err!(
> +                        "ERROR: Refcount mismatch after cancel!
> Expected {}, got {}\n",
> +                        initial_count + 1,
> +                        count_after_cancel
> +                    );
> +                    return Err(ENXIO);
> +                }
> +                drop(reclaimed);
> +                if get_arc_count(&test_item) != initial_count {
> +                    pr_err!(
> +                        "ERROR: Refcount mismatch after drop!
> Expected {}, got {}\n",
> +                        initial_count,
> +                        get_arc_count(&test_item)
> +                    );
> +                    return Err(ENXIO);
> +                }
> +            } else {
> +                pr_info!("Work already running or finished, could
> not reclaim via cancel().\n");
> +            }
> +        }
> +
> +        // 2. Stress Testing: Enqueue/Cancel Sync Loop
> +        {
> +            pr_info!("Starting stress test (1000 iterations)...\n");
> +            let test_item = Arc::pin_init(
> +                pin_init!(TestItem {
> +                    work <- new_work!("TestItem::work"),
> +                    value: 99,
> +                    delayed_work <-
> workqueue::new_delayed_work!("TestItem::delayed_work"),
> +                }),
> +                GFP_KERNEL,
> +            )?;
> +
> +            for i in 0..1000 {
> +                let _ =
> workqueue::system().enqueue(test_item.clone());
> +                // Use cancel_sync to ensure determinism for the
> next iteration
> +                let _ = test_item.work.cancel_sync();
> +
> +                if i % 250 == 0 {
> +                    pr_info!("Stress test progress: {}/1000\n", i);
> +                }
> +            }
> +
> +            if get_arc_count(&test_item) != 1 {
> +                pr_err!(
> +                    "ERROR: Refcount leak detected after stress
> test! count: {}\n",
> +                    get_arc_count(&test_item)
> +                );
> +                return Err(ENXIO);
> +            } else {
> +                pr_info!("Stress test completed successfully. No
> refcount leaks.\n");
> +            }
> +        }
> +
> +        // 3. Delayed Work Cancellation Test
> +        {
> +            let test_item = Arc::pin_init(
> +                pin_init!(TestItem {
> +                    work <- new_work!("TestItem::work"),
> +                    value: 7,
> +                    delayed_work <-
> workqueue::new_delayed_work!("TestItem::delayed_work"),
> +                }),
> +                GFP_KERNEL,
> +            )?;
> +
> +            let initial_count = get_arc_count(&test_item);
> +
> +            // Schedule with a long delay
> +            let to_enqueue = test_item.clone();
> +            let res = workqueue::system()
> +                .enqueue_delayed(to_enqueue, msecs_to_jiffies(5000));
> +            if let Err(returned) = res {
> +                pr_warn!("Delayed work already pending, returned
> item.\n");
> +                drop(returned);
> +            } else {
> +                pr_info!("Delayed work enqueued. count: {}\n",
> get_arc_count(&test_item));
> +            }
> +
> +            if test_item.delayed_work.is_pending() {
> +                pr_info!("Delayed work is pending as expected.\n");
> +            }
> +
> +            if let Some(reclaimed) = test_item.delayed_work.cancel()
> {
> +                pr_info!("Success: Delayed work reclaimed. No
> leak.\n");
> +                drop(reclaimed);
> +            } else {
> +                pr_warn!("Notice: Delayed work not reclaimed
> (running or never enqueued).\n");
> +            }
> +
> +            let final_count = get_arc_count(&test_item);
> +            if final_count != initial_count {
> +                pr_err!(
> +                    "ERROR: Refcount leak after delayed cancel!
> expected {}, got {}\n",
> +                    initial_count,
> +                    final_count
> +                );
> +                return Err(ENXIO);
> +            }
> +        }
> +
> +        pr_info!("Rust workqueue test: all robust checks passed\n");
> +        Ok(RustWorkqueueTest)
> +    }
> +}
> +
> +impl Drop for RustWorkqueueTest {
> +    fn drop(&mut self) {
> +        pr_info!("Rust workqueue test: exit\n");
> +    }
> +}
> +
> +module! {
> +    type: RustWorkqueueTest,
> +    name: "rust_workqueue_test",
> +    authors: ["Aakash Bollineni"],
> +    description: "Robust stress test for Rust workqueue API",
> +    license: "GPL",
> +}
> 


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH 3/3] rust: workqueue: add KUnit and sample stress tests
  2026-04-02  3:23 ` [PATCH 3/3] rust: workqueue: add KUnit and sample stress tests Aakash Bollineni via B4 Relay
  2026-04-02 11:23   ` Onur Özkan
@ 2026-04-02 11:34   ` Onur Özkan
  1 sibling, 0 replies; 7+ messages in thread
From: Onur Özkan @ 2026-04-02 11:34 UTC (permalink / raw)
  To: Aakash Bollineni via B4 Relay
  Cc: aakash.bollineni, rust-for-linux, linux-kernel

On Thu, 02 Apr 2026 08:53:48 +0530
Aakash Bollineni via B4 Relay
<devnull+aakash.bollineni.multicorewareinc.com@kernel.org> wrote:

> From: Aakash Bollineni <aakash.bollineni@multicorewareinc.com>
> 
> To ensure the safety and correctness of the improved workqueue API,
> this patch adds comprehensive testing infrastructure:
> 1. KUnit Tests: Adds an internal 'rust_kernel_workqueue' test suite
>    to rust/kernel/workqueue.rs. These tests verify basic and
>    synchronous cancellation, refcount stability, and concurrency
>    safety for both standard and delayed work.
> 2. Sample Module: Adds samples/rust/rust_workqueue_test.rs as a
>    standalone module that performs a 1000-iteration stress test
>    designed to verify race-free pointer handover during concurrent
>    enqueue/cancel operations.
> The tests use distinct types for standard and delayed work items to
> ensure clear trait dispatch and prevent field offset conflicts.
> 

This seems like the opposite of what you did with the TestItem type? It
contains both the standard and delayed work inside.

> Signed-off-by: Aakash Bollineni
> <aakash.bollineni@multicorewareinc.com> ---
>  samples/rust/Makefile               |   2 +
>  samples/rust/rust_workqueue_test.rs | 214
> ++++++++++++++++++++++++++++++++++++ 2 files changed, 216
> insertions(+)
> 
> diff --git a/samples/rust/Makefile b/samples/rust/Makefile
> index 6c0aaa58cccc..0f304bd90997 100644
> --- a/samples/rust/Makefile
> +++ b/samples/rust/Makefile
> @@ -20,3 +20,5 @@ obj-$(CONFIG_SAMPLE_RUST_SOC)
> 	+= rust_soc.o rust_print-y := rust_print_main.o
> rust_print_events.o 
>  subdir-$(CONFIG_SAMPLE_RUST_HOSTPROGS)		+= hostprogs
> +
> +obj-y += rust_workqueue_test.o
> diff --git a/samples/rust/rust_workqueue_test.rs
> b/samples/rust/rust_workqueue_test.rs new file mode 100644
> index 000000000000..c055ec1964c0
> --- /dev/null
> +++ b/samples/rust/rust_workqueue_test.rs
> @@ -0,0 +1,214 @@
> +// SPDX-License-Identifier: GPL-2.0
> +
> +//! Robust stress test for Rust workqueue API.
> +
> +use kernel::prelude::*;
> +use kernel::sync::Arc;
> +use kernel::time::msecs_to_jiffies;
> +use kernel::workqueue::{self, new_work, Work, WorkItem};
> +
> +#[pin_data]
> +struct TestItem {
> +    #[pin]
> +    work: Work<TestItem>,
> +    value: i32,
> +    #[pin]
> +    delayed_work: workqueue::DelayedWork<TestItem>,
> +}
> +
> +kernel::impl_has_work! {
> +    impl HasWork<Self> for TestItem { self.work }
> +}
> +
> +// SAFETY: The `delayed_work` field is at a fixed offset and is
> valid for the lifetime of +// `TestItem`.
> +unsafe impl workqueue::HasDelayedWork<Self> for TestItem {}
> +
> +impl WorkItem for TestItem {
> +    type Pointer = Arc<TestItem>;
> +
> +    fn run(this: Arc<TestItem>) {
> +        pr_info!(
> +            "Rust workqueue test: Work item running (value: {})\n",
> +            this.value
> +        );
> +    }
> +}
> +
> +/// Helper to get Arc strong count for verification in tests.
> +/// This uses internal layout knowledge of Arc.
> +fn get_arc_count<T: Sized>(arc: &Arc<T>) -> i32 {
> +    // SAFETY: ArcInner has refcount as its first field. Arc points
> to data at DATA_OFFSET.
> +    unsafe {
> +        let ptr = Arc::as_ptr(arc);
> +        let inner_ptr = (ptr as *const
> u8).sub(Arc::<T>::DATA_OFFSET);
> +        // The first field of ArcInner is Refcount, which is a
> transparent wrapper around
> +        // refcount_t. In the kernel, refcount_t is an atomic_t
> (i32).
> +        let refcount_ptr = inner_ptr as *const i32;
> +        // We use a relaxed load to get the current value.
> +        core::ptr::read_volatile(refcount_ptr)
> +    }
> +}
> +
> +struct RustWorkqueueTest;
> +
> +impl kernel::Module for RustWorkqueueTest {
> +    fn init(_module: &'static ThisModule) -> Result<Self> {
> +        pr_info!("Rust workqueue test: starting robust
> verification\n"); +
> +        // 1. Basic Lifecycle with Refcount Validation
> +        {
> +            let test_item = Arc::pin_init(
> +                pin_init!(TestItem {
> +                    work <- new_work!("TestItem::work"),
> +                    value: 42,
> +                    delayed_work <-
> workqueue::new_delayed_work!("TestItem::delayed_work"),
> +                }),
> +                GFP_KERNEL,
> +            )?;
> +
> +            let initial_count = get_arc_count(&test_item);
> +            pr_info!("Initial Arc strong count: {}\n",
> initial_count); +
> +            // Enqueue
> +            let enqueued_item = test_item.clone();
> +
> +            if let Err(returned_item) =
> workqueue::system().enqueue(enqueued_item) {
> +                pr_warn!("Work already pending, unexpected!\n");
> +                let _ = returned_item;
> +            } else {
> +                pr_info!(
> +                    "Work enqueued successfully. Strong count: {}\n",
> +                    get_arc_count(&test_item)
> +                );
> +            }
> +
> +            // Cancel immediately (best effort)
> +            if let Some(reclaimed) = test_item.work.cancel() {
> +                let count_after_cancel = get_arc_count(&test_item);
> +                pr_info!(
> +                    "Success: Work cancelled and Arc reclaimed.
> Strong count: {}\n",
> +                    count_after_cancel
> +                );
> +
> +                // VALIDATION: Reclamation must restore the refcount
> (minus the clone we just
> +                // reclaimed)
> +                if count_after_cancel != initial_count + 1 {
> +                    pr_err!(
> +                        "ERROR: Refcount mismatch after cancel!
> Expected {}, got {}\n",
> +                        initial_count + 1,
> +                        count_after_cancel
> +                    );
> +                    return Err(ENXIO);
> +                }
> +                drop(reclaimed);
> +                if get_arc_count(&test_item) != initial_count {
> +                    pr_err!(
> +                        "ERROR: Refcount mismatch after drop!
> Expected {}, got {}\n",
> +                        initial_count,
> +                        get_arc_count(&test_item)
> +                    );
> +                    return Err(ENXIO);
> +                }
> +            } else {
> +                pr_info!("Work already running or finished, could
> not reclaim via cancel().\n");
> +            }
> +        }
> +
> +        // 2. Stress Testing: Enqueue/Cancel Sync Loop
> +        {
> +            pr_info!("Starting stress test (1000 iterations)...\n");
> +            let test_item = Arc::pin_init(
> +                pin_init!(TestItem {
> +                    work <- new_work!("TestItem::work"),
> +                    value: 99,
> +                    delayed_work <-
> workqueue::new_delayed_work!("TestItem::delayed_work"),
> +                }),
> +                GFP_KERNEL,
> +            )?;
> +
> +            for i in 0..1000 {
> +                let _ =
> workqueue::system().enqueue(test_item.clone());
> +                // Use cancel_sync to ensure determinism for the
> next iteration
> +                let _ = test_item.work.cancel_sync();
> +
> +                if i % 250 == 0 {
> +                    pr_info!("Stress test progress: {}/1000\n", i);
> +                }
> +            }
> +
> +            if get_arc_count(&test_item) != 1 {
> +                pr_err!(
> +                    "ERROR: Refcount leak detected after stress
> test! count: {}\n",
> +                    get_arc_count(&test_item)
> +                );
> +                return Err(ENXIO);
> +            } else {
> +                pr_info!("Stress test completed successfully. No
> refcount leaks.\n");
> +            }
> +        }
> +
> +        // 3. Delayed Work Cancellation Test
> +        {
> +            let test_item = Arc::pin_init(
> +                pin_init!(TestItem {
> +                    work <- new_work!("TestItem::work"),
> +                    value: 7,
> +                    delayed_work <-
> workqueue::new_delayed_work!("TestItem::delayed_work"),
> +                }),
> +                GFP_KERNEL,
> +            )?;
> +
> +            let initial_count = get_arc_count(&test_item);
> +
> +            // Schedule with a long delay
> +            let to_enqueue = test_item.clone();
> +            let res = workqueue::system()
> +                .enqueue_delayed(to_enqueue, msecs_to_jiffies(5000));
> +            if let Err(returned) = res {
> +                pr_warn!("Delayed work already pending, returned
> item.\n");
> +                drop(returned);
> +            } else {
> +                pr_info!("Delayed work enqueued. count: {}\n",
> get_arc_count(&test_item));
> +            }
> +
> +            if test_item.delayed_work.is_pending() {
> +                pr_info!("Delayed work is pending as expected.\n");
> +            }
> +
> +            if let Some(reclaimed) = test_item.delayed_work.cancel()
> {
> +                pr_info!("Success: Delayed work reclaimed. No
> leak.\n");
> +                drop(reclaimed);
> +            } else {
> +                pr_warn!("Notice: Delayed work not reclaimed
> (running or never enqueued).\n");
> +            }
> +
> +            let final_count = get_arc_count(&test_item);
> +            if final_count != initial_count {
> +                pr_err!(
> +                    "ERROR: Refcount leak after delayed cancel!
> expected {}, got {}\n",
> +                    initial_count,
> +                    final_count
> +                );
> +                return Err(ENXIO);
> +            }
> +        }
> +
> +        pr_info!("Rust workqueue test: all robust checks passed\n");
> +        Ok(RustWorkqueueTest)
> +    }
> +}
> +
> +impl Drop for RustWorkqueueTest {
> +    fn drop(&mut self) {
> +        pr_info!("Rust workqueue test: exit\n");
> +    }
> +}
> +
> +module! {
> +    type: RustWorkqueueTest,
> +    name: "rust_workqueue_test",
> +    authors: ["Aakash Bollineni"],
> +    description: "Robust stress test for Rust workqueue API",
> +    license: "GPL",
> +}
> 


^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2026-04-02 11:34 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --)
2026-04-02  3:23 [PATCH 0/3] rust: workqueue: add safe cancellation and status methods Aakash Bollineni via B4 Relay
2026-04-02  3:23 ` [PATCH 1/3] rust: helpers: add workqueue helpers Aakash Bollineni via B4 Relay
2026-04-02  3:23 ` [PATCH 2/3] rust: workqueue: add safe cancellation and status methods Aakash Bollineni via B4 Relay
2026-04-02  3:23 ` [PATCH 3/3] rust: workqueue: add KUnit and sample stress tests Aakash Bollineni via B4 Relay
2026-04-02 11:23   ` Onur Özkan
2026-04-02 11:34   ` Onur Özkan
2026-04-02 11:19 ` [PATCH 0/3] rust: workqueue: add safe cancellation and status methods Onur Özkan

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox