* [PATCH v2 0/2] Avoid synchronize_rcu() for every thread drop in Rust Binder
@ 2026-02-13 11:29 Alice Ryhl
  2026-02-13 11:29 ` [PATCH v2 1/2] rust: poll: make PollCondVar upgradable Alice Ryhl
  2026-02-13 11:29 ` [PATCH v2 2/2] rust_binder: use UpgradePollCondVar Alice Ryhl
  0 siblings, 2 replies; 8+ messages in thread
From: Alice Ryhl @ 2026-02-13 11:29 UTC (permalink / raw)
  To: Christian Brauner, Boqun Feng, Paul E. McKenney
  Cc: Gary Guo, Greg Kroah-Hartman, Carlos Llamas, linux-fsdevel,
	rust-for-linux, linux-kernel, Alice Ryhl

Right now Rust Binder calls synchronize_rcu() more often than
necessary. Most processes do not use epoll at all, so they don't need
RCU here. Back at Kangrejos I came up with a way to avoid this: move
the value that needs RCU into a separate allocation that can simply be
kfree_rcu()'d, and pay for that allocation only when the process
actually uses epoll, via an "upgrade" strategy - most processes never
do.

One solution that may be better is to just kfree_rcu() the Binder
Thread struct directly when its refcount drops to zero (the Drop of
everything else can still run without a grace period). But I'm not
sure how I would achieve that - after all, Thread is also used with
kernel::list.

Signed-off-by: Alice Ryhl <aliceryhl@google.com>
---
Changes in v2:
- Change how Rust Binder handles the lock class key.
- Rebase.
- Link to v1: https://lore.kernel.org/r/20260117-upgrade-poll-v1-0-179437b7bd49@google.com

---
Alice Ryhl (2):
      rust: poll: make PollCondVar upgradable
      rust_binder: use UpgradePollCondVar

 drivers/android/binder/process.rs |   2 +-
 drivers/android/binder/thread.rs  |  25 +++---
 rust/kernel/sync/poll.rs          | 160 +++++++++++++++++++++++++++++++++++++-
 3 files changed, 176 insertions(+), 11 deletions(-)
---
base-commit: de718b2ca866e10e2a26c259ab0493a5af411879
change-id: 20260117-upgrade-poll-37ee2a7a79dd

Best regards,
-- 
Alice Ryhl <aliceryhl@google.com>



* [PATCH v2 1/2] rust: poll: make PollCondVar upgradable
  2026-02-13 11:29 [PATCH v2 0/2] Avoid synchronize_rcu() for every thread drop in Rust Binder Alice Ryhl
@ 2026-02-13 11:29 ` Alice Ryhl
  2026-03-03 22:08   ` Boqun Feng
  2026-02-13 11:29 ` [PATCH v2 2/2] rust_binder: use UpgradePollCondVar Alice Ryhl
  1 sibling, 1 reply; 8+ messages in thread
From: Alice Ryhl @ 2026-02-13 11:29 UTC (permalink / raw)
  To: Christian Brauner, Boqun Feng, Paul E. McKenney
  Cc: Gary Guo, Greg Kroah-Hartman, Carlos Llamas, linux-fsdevel,
	rust-for-linux, linux-kernel, Alice Ryhl

Rust Binder currently uses PollCondVar, but its destructor calls
synchronize_rcu(), which we would like to avoid. Add a variation of
PollCondVar that uses kfree_rcu() instead.

Signed-off-by: Alice Ryhl <aliceryhl@google.com>
---
 rust/kernel/sync/poll.rs | 160 ++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 159 insertions(+), 1 deletion(-)

diff --git a/rust/kernel/sync/poll.rs b/rust/kernel/sync/poll.rs
index 0ec985d560c8d3405c08dbd86e48b14c7c34484d..9555f818a24d777dd908fca849015c3490ce38d3 100644
--- a/rust/kernel/sync/poll.rs
+++ b/rust/kernel/sync/poll.rs
@@ -5,12 +5,21 @@
 //! Utilities for working with `struct poll_table`.
 
 use crate::{
+    alloc::AllocError,
     bindings,
+    container_of,
     fs::File,
     prelude::*,
+    sync::atomic::{Acquire, Atomic, Relaxed, Release},
+    sync::lock::{Backend, Lock},
     sync::{CondVar, LockClassKey},
+    types::Opaque, //
+};
+use core::{
+    marker::{PhantomData, PhantomPinned},
+    ops::Deref,
+    ptr,
 };
-use core::{marker::PhantomData, ops::Deref};
 
 /// Creates a [`PollCondVar`] initialiser with the given name and a newly-created lock class.
 #[macro_export]
@@ -66,6 +75,7 @@ pub fn register_wait(&self, file: &File, cv: &PollCondVar) {
 ///
 /// [`CondVar`]: crate::sync::CondVar
 #[pin_data(PinnedDrop)]
+#[repr(transparent)]
 pub struct PollCondVar {
     #[pin]
     inner: CondVar,
@@ -78,6 +88,17 @@ pub fn new(name: &'static CStr, key: Pin<&'static LockClassKey>) -> impl PinInit
             inner <- CondVar::new(name, key),
         })
     }
+
+    /// Use this `CondVar` as a `PollCondVar`.
+    ///
+    /// # Safety
+    ///
+    /// After the last use of the returned `&PollCondVar`, `__wake_up_pollfree` must be called on
+    /// the `wait_queue_head` at least one grace period before the `CondVar` is destroyed.
+    unsafe fn from_non_poll(c: &CondVar) -> &PollCondVar {
+        // SAFETY: Layout is the same. Caller ensures that PollTables are cleared in time.
+        unsafe { &*ptr::from_ref(c).cast() }
+    }
 }
 
 // Make the `CondVar` methods callable on `PollCondVar`.
@@ -104,3 +125,140 @@ fn drop(self: Pin<&mut Self>) {
         unsafe { bindings::synchronize_rcu() };
     }
 }
+
+/// Wrapper around [`CondVar`] that can be upgraded to [`PollCondVar`].
+///
+/// By using this wrapper, you can avoid rcu for cases that don't use [`PollTable`], and in all
+/// cases you can avoid `synchronize_rcu()`.
+///
+/// # Invariants
+///
+/// `active` either references `simple`, or a `kmalloc` allocation holding an
+/// `UpgradePollCondVarInner`. In the latter case, the allocation remains valid until
+/// `Self::drop()` plus one grace period.
+#[pin_data(PinnedDrop)]
+pub struct UpgradePollCondVar {
+    #[pin]
+    simple: CondVar,
+    active: Atomic<*const CondVar>,
+    #[pin]
+    _pin: PhantomPinned,
+}
+
+#[pin_data]
+#[repr(C)]
+struct UpgradePollCondVarInner {
+    #[pin]
+    upgraded: CondVar,
+    #[pin]
+    rcu: Opaque<bindings::callback_head>,
+}
+
+impl UpgradePollCondVar {
+    /// Constructs a new upgradable condvar initialiser.
+    pub fn new(name: &'static CStr, key: Pin<&'static LockClassKey>) -> impl PinInit<Self> {
+        pin_init!(&this in Self {
+            simple <- CondVar::new(name, key),
+            // SAFETY: `this->simple` is in-bounds. Pointer remains valid since this type is
+            // pinned.
+            active: Atomic::new(unsafe { &raw const (*this.as_ptr()).simple }),
+            _pin: PhantomPinned,
+        })
+    }
+
+    /// Obtain a [`PollCondVar`], upgrading if necessary.
+    ///
+    /// You should pass the same lock that is used with the `wait_*` methods. Otherwise wakeups
+    /// may be missed.
+    pub fn poll<T: ?Sized, B: Backend>(
+        &self,
+        lock: &Lock<T, B>,
+        name: &'static CStr,
+        key: Pin<&'static LockClassKey>,
+    ) -> Result<&PollCondVar, AllocError> {
+        let mut ptr = self.active.load(Acquire);
+        if ptr::eq(ptr, &self.simple) {
+            self.upgrade(lock, name, key)?;
+            ptr = self.active.load(Acquire);
+            debug_assert_ne!(ptr, ptr::from_ref(&self.simple));
+        }
+        // SAFETY: The signature ensures that the last use of the returned `&PollCondVar` happens
+        // before drop(), and drop() calls `__wake_up_pollfree` and only destroys the `CondVar`
+        // after a grace period.
+        Ok(unsafe { PollCondVar::from_non_poll(&*ptr) })
+    }
+
+    fn upgrade<T: ?Sized, B: Backend>(
+        &self,
+        lock: &Lock<T, B>,
+        name: &'static CStr,
+        key: Pin<&'static LockClassKey>,
+    ) -> Result<(), AllocError> {
+        let upgraded = KBox::pin_init(
+            pin_init!(UpgradePollCondVarInner {
+                upgraded <- CondVar::new(name, key),
+                rcu: Opaque::uninit(),
+            }),
+            GFP_KERNEL,
+        )
+        .map_err(|_| AllocError)?;
+
+        // SAFETY: The value is treated as pinned.
+        let upgraded = KBox::into_raw(unsafe { Pin::into_inner_unchecked(upgraded) });
+
+        let res = self.active.cmpxchg(
+            ptr::from_ref(&self.simple),
+            // SAFETY: This operation stays in-bounds of the above allocation.
+            unsafe { &raw mut (*upgraded).upgraded },
+            Release,
+        );
+
+        if res.is_err() {
+            // Already upgraded by someone else, so this still counts as success.
+            // SAFETY: The cmpxchg failed, so take back ownership of the box.
+            drop(unsafe { KBox::from_raw(upgraded) });
+            return Ok(());
+        }
+
+        // If a normal waiter registers in parallel with us, then either:
+        // * We took the lock first. In that case, the waiter sees the above cmpxchg.
+        // * They took the lock first. In that case, we wake them up below.
+        drop(lock.lock());
+        self.simple.notify_all();
+
+        Ok(())
+    }
+}
+
+// Make the `CondVar` methods callable on `UpgradePollCondVar`.
+impl Deref for UpgradePollCondVar {
+    type Target = CondVar;
+
+    fn deref(&self) -> &CondVar {
+        // SAFETY: By the type invariants, this is either `&self.simple` or references an
+        // allocation that lives until `UpgradePollCondVar::drop`.
+        unsafe { &*self.active.load(Acquire) }
+    }
+}
+
+#[pinned_drop]
+impl PinnedDrop for UpgradePollCondVar {
+    #[inline]
+    fn drop(self: Pin<&mut Self>) {
+        // ORDERING: All calls to upgrade() happen before drop(), so no synchronization is required.
+        let ptr = self.active.load(Relaxed);
+        if ptr::eq(ptr, &self.simple) {
+            return;
+        }
+        // SAFETY: When the pointer is not `&self.simple`, it points into an `UpgradePollCondVarInner`.
+        let ptr = unsafe { container_of!(ptr.cast_mut(), UpgradePollCondVarInner, upgraded) };
+        // SAFETY: The pointer points at a valid `wait_queue_head`.
+        unsafe { bindings::__wake_up_pollfree((*ptr).upgraded.wait_queue_head.get()) };
+        // This skips drop of `CondVar`, but that's ok because we reimplemented its drop here.
+        //
+        // SAFETY: `__wake_up_pollfree` ensures that all registered PollTable instances are gone in
+        // one grace period, and this is the destructor so no new PollTable instances can be
+        // registered. Thus, it's safe to RCU-free the `UpgradePollCondVarInner`.
+        unsafe { bindings::kvfree_call_rcu((*ptr).rcu.get(), ptr.cast::<c_void>()) };
+    }
+}

-- 
2.53.0.273.g2a3d683680-goog



* [PATCH v2 2/2] rust_binder: use UpgradePollCondVar
  2026-02-13 11:29 [PATCH v2 0/2] Avoid synchronize_rcu() for every thread drop in Rust Binder Alice Ryhl
  2026-02-13 11:29 ` [PATCH v2 1/2] rust: poll: make PollCondVar upgradable Alice Ryhl
@ 2026-02-13 11:29 ` Alice Ryhl
  1 sibling, 0 replies; 8+ messages in thread
From: Alice Ryhl @ 2026-02-13 11:29 UTC (permalink / raw)
  To: Christian Brauner, Boqun Feng, Paul E. McKenney
  Cc: Gary Guo, Greg Kroah-Hartman, Carlos Llamas, linux-fsdevel,
	rust-for-linux, linux-kernel, Alice Ryhl

Most processes do not use Rust Binder with epoll, so avoid paying the
synchronize_rcu() cost on drop for those that don't need it. For those
that do, synchronize_rcu() is replaced with kfree_rcu(), at the cost of
an extra allocation.

This also ensures that synchronize_rcu() is not called from the
destructor of Arc<Thread> when the last ref is dropped outside of
deferred_release(). In theory that could cause jank by slowing down
other syscalls, though I have not seen it happen in practice.

Signed-off-by: Alice Ryhl <aliceryhl@google.com>
---
 drivers/android/binder/process.rs |  2 +-
 drivers/android/binder/thread.rs  | 25 ++++++++++++++++---------
 2 files changed, 17 insertions(+), 10 deletions(-)

diff --git a/drivers/android/binder/process.rs b/drivers/android/binder/process.rs
index 132055b4790f0ec69a87635b498909df2bf475e2..9374f1a86766c09321b57e565b6317cc290ea32b 100644
--- a/drivers/android/binder/process.rs
+++ b/drivers/android/binder/process.rs
@@ -1684,7 +1684,7 @@ pub(crate) fn poll(
         table: PollTable<'_>,
     ) -> Result<u32> {
         let thread = this.get_current_thread()?;
-        let (from_proc, mut mask) = thread.poll(file, table);
+        let (from_proc, mut mask) = thread.poll(file, table)?;
         if mask == 0 && from_proc && !this.inner.lock().work.is_empty() {
             mask |= bindings::POLLIN;
         }
diff --git a/drivers/android/binder/thread.rs b/drivers/android/binder/thread.rs
index 82264db06507d4641b60cbed96af482a9d36e7b2..a07210405c64e19984f49777a9d2c7b218944755 100644
--- a/drivers/android/binder/thread.rs
+++ b/drivers/android/binder/thread.rs
@@ -16,8 +16,8 @@
     seq_file::SeqFile,
     seq_print,
     sync::atomic::{ordering::Relaxed, Atomic},
-    sync::poll::{PollCondVar, PollTable},
-    sync::{Arc, SpinLock},
+    sync::poll::{PollTable, UpgradePollCondVar},
+    sync::{Arc, LockClassKey, SpinLock},
     task::Task,
     types::ARef,
     uaccess::UserSlice,
@@ -35,7 +35,7 @@
     BinderReturnWriter, DArc, DLArc, DTRWrap, DeliverCode, DeliverToRead,
 };
 
-use core::mem::size_of;
+use core::{mem::size_of, pin::Pin};
 
 /// Stores the layout of the scatter-gather entries. This is used during the `translate_objects`
 /// call and is discarded when it returns.
@@ -412,7 +412,7 @@ pub(crate) struct Thread {
     #[pin]
     inner: SpinLock<InnerThread>,
     #[pin]
-    work_condvar: PollCondVar,
+    work_condvar: UpgradePollCondVar,
     /// Used to insert this thread into the process' `ready_threads` list.
     ///
     /// INVARIANT: May never be used for any other list than the `self.process.ready_threads`.
@@ -433,6 +433,11 @@ impl ListItem<0> for Thread {
     }
 }
 
+const THREAD_CONDVAR_NAME: &CStr = c"Thread::work_condvar";
+fn thread_condvar_class() -> Pin<&'static LockClassKey> {
+    kernel::static_lock_class!()
+}
+
 impl Thread {
     pub(crate) fn new(id: i32, process: Arc<Process>) -> Result<Arc<Self>> {
         let inner = InnerThread::new()?;
@@ -443,7 +448,7 @@ pub(crate) fn new(id: i32, process: Arc<Process>) -> Result<Arc<Self>> {
                 process,
                 task: ARef::from(&**kernel::current!()),
                 inner <- kernel::new_spinlock!(inner, "Thread::inner"),
-                work_condvar <- kernel::new_poll_condvar!("Thread::work_condvar"),
+                work_condvar <- UpgradePollCondVar::new(THREAD_CONDVAR_NAME, thread_condvar_class()),
                 links <- ListLinks::new(),
                 links_track <- AtomicTracker::new(),
             }),
@@ -1484,10 +1489,13 @@ pub(crate) fn write_read(self: &Arc<Self>, data: UserSlice, wait: bool) -> Resul
         ret
     }
 
-    pub(crate) fn poll(&self, file: &File, table: PollTable<'_>) -> (bool, u32) {
-        table.register_wait(file, &self.work_condvar);
+    pub(crate) fn poll(&self, file: &File, table: PollTable<'_>) -> Result<(bool, u32)> {
+        let condvar =
+            self.work_condvar
+                .poll(&self.inner, THREAD_CONDVAR_NAME, thread_condvar_class())?;
+        table.register_wait(file, condvar);
         let mut inner = self.inner.lock();
-        (inner.should_use_process_work_queue(), inner.poll())
+        Ok((inner.should_use_process_work_queue(), inner.poll()))
     }
 
     /// Make the call to `get_work` or `get_work_local` return immediately, if any.
@@ -1523,7 +1531,6 @@ pub(crate) fn notify_if_poll_ready(&self, sync: bool) {
     pub(crate) fn release(self: &Arc<Self>) {
         self.inner.lock().is_dead = true;
 
-        //self.work_condvar.clear();
         self.unwind_transaction_stack();
 
         // Cancel all pending work items.

-- 
2.53.0.273.g2a3d683680-goog



* Re: [PATCH v2 1/2] rust: poll: make PollCondVar upgradable
  2026-02-13 11:29 ` [PATCH v2 1/2] rust: poll: make PollCondVar upgradable Alice Ryhl
@ 2026-03-03 22:08   ` Boqun Feng
  2026-03-04  7:59     ` Alice Ryhl
  0 siblings, 1 reply; 8+ messages in thread
From: Boqun Feng @ 2026-03-03 22:08 UTC (permalink / raw)
  To: Alice Ryhl
  Cc: Christian Brauner, Boqun Feng, Paul E. McKenney, Gary Guo,
	Greg Kroah-Hartman, Carlos Llamas, linux-fsdevel, rust-for-linux,
	linux-kernel

On Fri, Feb 13, 2026 at 11:29:41AM +0000, Alice Ryhl wrote:
> Rust Binder currently uses PollCondVar, but it calls synchronize_rcu()
> in the destructor, which we would like to avoid. Add a variation of
> PollCondVar, which uses kfree_rcu() instead.
> 
> Signed-off-by: Alice Ryhl <aliceryhl@google.com>
> ---
>  rust/kernel/sync/poll.rs | 160 ++++++++++++++++++++++++++++++++++++++++++++++-
>  1 file changed, 159 insertions(+), 1 deletion(-)
> 
> diff --git a/rust/kernel/sync/poll.rs b/rust/kernel/sync/poll.rs
> index 0ec985d560c8d3405c08dbd86e48b14c7c34484d..9555f818a24d777dd908fca849015c3490ce38d3 100644
> --- a/rust/kernel/sync/poll.rs
> +++ b/rust/kernel/sync/poll.rs
> @@ -5,12 +5,21 @@
>  //! Utilities for working with `struct poll_table`.
>  
>  use crate::{
> +    alloc::AllocError,
>      bindings,
> +    container_of,
>      fs::File,
>      prelude::*,
> +    sync::atomic::{Acquire, Atomic, Relaxed, Release},
> +    sync::lock::{Backend, Lock},
>      sync::{CondVar, LockClassKey},
> +    types::Opaque, //
> +};
> +use core::{
> +    marker::{PhantomData, PhantomPinned},
> +    ops::Deref,
> +    ptr,
>  };
> -use core::{marker::PhantomData, ops::Deref};
>  
>  /// Creates a [`PollCondVar`] initialiser with the given name and a newly-created lock class.
>  #[macro_export]
> @@ -66,6 +75,7 @@ pub fn register_wait(&self, file: &File, cv: &PollCondVar) {
>  ///
>  /// [`CondVar`]: crate::sync::CondVar
>  #[pin_data(PinnedDrop)]
> +#[repr(transparent)]
>  pub struct PollCondVar {
>      #[pin]
>      inner: CondVar,
> @@ -78,6 +88,17 @@ pub fn new(name: &'static CStr, key: Pin<&'static LockClassKey>) -> impl PinInit
>              inner <- CondVar::new(name, key),
>          })
>      }
> +
> +    /// Use this `CondVar` as a `PollCondVar`.
> +    ///
> +    /// # Safety
> +    ///
> +    /// After the last use of the returned `&PollCondVar`, `__wake_up_pollfree` must be called on
> +    /// the `wait_queue_head` at least one grace period before the `CondVar` is destroyed.
> +    unsafe fn from_non_poll(c: &CondVar) -> &PollCondVar {
> +        // SAFETY: Layout is the same. Caller ensures that PollTables are cleared in time.
> +        unsafe { &*ptr::from_ref(c).cast() }
> +    }
>  }
>  
>  // Make the `CondVar` methods callable on `PollCondVar`.
> @@ -104,3 +125,140 @@ fn drop(self: Pin<&mut Self>) {
>          unsafe { bindings::synchronize_rcu() };
>      }
>  }
> +
> +/// Wrapper around [`CondVar`] that can be upgraded to [`PollCondVar`].
> +///
> +/// By using this wrapper, you can avoid rcu for cases that don't use [`PollTable`], and in all
> +/// cases you can avoid `synchronize_rcu()`.
> +///
> +/// # Invariants
> +///
> +/// `active` either references `simple`, or a `kmalloc` allocation holding an
> +/// `UpgradePollCondVarInner`. In the latter case, the allocation remains valid until
> +/// `Self::drop()` plus one grace period.
> +#[pin_data(PinnedDrop)]
> +pub struct UpgradePollCondVar {
> +    #[pin]
> +    simple: CondVar,
> +    active: Atomic<*const CondVar>,
> +    #[pin]
> +    _pin: PhantomPinned,
> +}
> +
> +#[pin_data]
> +#[repr(C)]
> +struct UpgradePollCondVarInner {
> +    #[pin]
> +    upgraded: CondVar,
> +    #[pin]
> +    rcu: Opaque<bindings::callback_head>,
> +}
> +
> +impl UpgradePollCondVar {
> +    /// Constructs a new upgradable condvar initialiser.
> +    pub fn new(name: &'static CStr, key: Pin<&'static LockClassKey>) -> impl PinInit<Self> {
> +        pin_init!(&this in Self {
> +            simple <- CondVar::new(name, key),
> +            // SAFETY: `this->simple` is in-bounds. Pointer remains valid since this type is
> +            // pinned.
> +            active: Atomic::new(unsafe { &raw const (*this.as_ptr()).simple }),
> +            _pin: PhantomPinned,
> +        })
> +    }
> +
> +    /// Obtain a [`PollCondVar`], upgrading if necessary.
> +    ///
> +    /// You should use the same lock as what is passed to the `wait_*` methods. Otherwise wakeups
> +    /// may be missed.
> +    pub fn poll<T: ?Sized, B: Backend>(
> +        &self,
> +        lock: &Lock<T, B>,
> +        name: &'static CStr,
> +        key: Pin<&'static LockClassKey>,
> +    ) -> Result<&PollCondVar, AllocError> {
> +        let mut ptr = self.active.load(Acquire);
> +        if ptr::eq(ptr, &self.simple) {
> +            self.upgrade(lock, name, key)?;
> +            ptr = self.active.load(Acquire);
> +            debug_assert_ne!(ptr, ptr::from_ref(&self.simple));
> +        }
> +        // SAFETY: Signature ensures that last use of returned `&PollCondVar` is before drop(), and
> +        // drop() calls `__wake_up_pollfree` followed by waiting a grace period before the
> +        // `CondVar` is destroyed.
> +        Ok(unsafe { PollCondVar::from_non_poll(&*ptr) })
> +    }
> +
> +    fn upgrade<T: ?Sized, B: Backend>(
> +        &self,
> +        lock: &Lock<T, B>,
> +        name: &'static CStr,
> +        key: Pin<&'static LockClassKey>,
> +    ) -> Result<(), AllocError> {
> +        let upgraded = KBox::pin_init(
> +            pin_init!(UpgradePollCondVarInner {
> +                upgraded <- CondVar::new(name, key),
> +                rcu: Opaque::uninit(),
> +            }),
> +            GFP_KERNEL,
> +        )
> +        .map_err(|_| AllocError)?;
> +
> +        // SAFETY: The value is treated as pinned.
> +        let upgraded = KBox::into_raw(unsafe { Pin::into_inner_unchecked(upgraded) });
> +
> +        let res = self.active.cmpxchg(
> +            ptr::from_ref(&self.simple),
> +            // SAFETY: This operation stays in-bounds of the above allocation.
> +            unsafe { &raw mut (*upgraded).upgraded },
> +            Release,
> +        );
> +
> +        if res.is_err() {
> +            // Already upgraded, so still succeess.
> +            // SAFETY: The cmpxchg failed, so take back ownership of the box.
> +            drop(unsafe { KBox::from_raw(upgraded) });
> +            return Ok(());
> +        }
> +
> +        // If a normal waiter registers in parallel with us, then either:
> +        // * We took the lock first. In that case, the waiter sees the above cmpxchg.
> +        // * They took the lock first. In that case, we wake them up below.
> +        drop(lock.lock());
> +        self.simple.notify_all();

Hmm.. what if the waiter gets its `&CondVar` before `upgrade()` and uses
that directly?

	<waiter>				<in upgrade()>
	let poll_cv: &UpgradePollCondVar = ...;
	let cv = poll_cv.deref();
						cmpxchg();
						drop(lock.lock());
						self.simple.notify_all();
	let mut guard = lock.lock();
	cv.wait(&mut guard);

we still miss the wake-up, right?

It's creative, but I particularly hate that we use an empty lock
critical section to synchronize ;-)

Do you think the complexity of dynamic upgrading is worthwhile, or
should we just use a box-allocated PollCondVar unconditionally?

I think if the current users won't benefit from dynamic upgrading,
then we can avoid the complexity. We can always add it back later.
Thoughts?

Regards,
Boqun

> +
> +        Ok(())
> +    }
> +}
> +
> +// Make the `CondVar` methods callable on `UpgradePollCondVar`.
> +impl Deref for UpgradePollCondVar {
> +    type Target = CondVar;
> +
> +    fn deref(&self) -> &CondVar {
> +        // SAFETY: By the type invariants, this is either `&self.simple` or references an
> +        // allocation that lives until `UpgradePollCondVar::drop`.
> +        unsafe { &*self.active.load(Acquire) }
> +    }
> +}
> +
> +#[pinned_drop]
> +impl PinnedDrop for UpgradePollCondVar {
> +    #[inline]
> +    fn drop(self: Pin<&mut Self>) {
> +        // ORDERING: All calls to upgrade happens-before Drop, so no synchronization is required.
> +        let ptr = self.active.load(Relaxed);
> +        if ptr::eq(ptr, &self.simple) {
> +            return;
> +        }
> +        // SAFETY: When the pointer is not &self.active, it is an `UpgradePollCondVarInner`.
> +        let ptr = unsafe { container_of!(ptr.cast_mut(), UpgradePollCondVarInner, upgraded) };
> +        // SAFETY: The pointer points at a valid `wait_queue_head`.
> +        unsafe { bindings::__wake_up_pollfree((*ptr).upgraded.wait_queue_head.get()) };
> +        // This skips drop of `CondVar`, but that's ok because we reimplemented its drop here.
> +        //
> +        // SAFETY: `__wake_up_pollfree` ensures that all registered PollTable instances are gone in
> +        // one grace period, and this is the destructor so no new PollTable instances can be
> +        // registered. Thus, it's safety to rcu free the `UpgradePollCondVarInner`.
> +        unsafe { bindings::kvfree_call_rcu((*ptr).rcu.get(), ptr.cast::<c_void>()) };
> +    }
> +}
> 
> -- 
> 2.53.0.273.g2a3d683680-goog
> 


* Re: [PATCH v2 1/2] rust: poll: make PollCondVar upgradable
  2026-03-03 22:08   ` Boqun Feng
@ 2026-03-04  7:59     ` Alice Ryhl
  2026-03-04 16:29       ` Boqun Feng
  0 siblings, 1 reply; 8+ messages in thread
From: Alice Ryhl @ 2026-03-04  7:59 UTC (permalink / raw)
  To: Boqun Feng
  Cc: Christian Brauner, Boqun Feng, Paul E. McKenney, Gary Guo,
	Greg Kroah-Hartman, Carlos Llamas, linux-fsdevel, rust-for-linux,
	linux-kernel

On Tue, Mar 03, 2026 at 02:08:08PM -0800, Boqun Feng wrote:
> On Fri, Feb 13, 2026 at 11:29:41AM +0000, Alice Ryhl wrote:
> > Rust Binder currently uses PollCondVar, but it calls synchronize_rcu()
> > in the destructor, which we would like to avoid. Add a variation of
> > PollCondVar, which uses kfree_rcu() instead.
> > 
> > Signed-off-by: Alice Ryhl <aliceryhl@google.com>
> > ---
> >  rust/kernel/sync/poll.rs | 160 ++++++++++++++++++++++++++++++++++++++++++++++-
> >  1 file changed, 159 insertions(+), 1 deletion(-)
> > 
> > diff --git a/rust/kernel/sync/poll.rs b/rust/kernel/sync/poll.rs
> > index 0ec985d560c8d3405c08dbd86e48b14c7c34484d..9555f818a24d777dd908fca849015c3490ce38d3 100644
> > --- a/rust/kernel/sync/poll.rs
> > +++ b/rust/kernel/sync/poll.rs
> > @@ -5,12 +5,21 @@
> >  //! Utilities for working with `struct poll_table`.
> >  
> >  use crate::{
> > +    alloc::AllocError,
> >      bindings,
> > +    container_of,
> >      fs::File,
> >      prelude::*,
> > +    sync::atomic::{Acquire, Atomic, Relaxed, Release},
> > +    sync::lock::{Backend, Lock},
> >      sync::{CondVar, LockClassKey},
> > +    types::Opaque, //
> > +};
> > +use core::{
> > +    marker::{PhantomData, PhantomPinned},
> > +    ops::Deref,
> > +    ptr,
> >  };
> > -use core::{marker::PhantomData, ops::Deref};
> >  
> >  /// Creates a [`PollCondVar`] initialiser with the given name and a newly-created lock class.
> >  #[macro_export]
> > @@ -66,6 +75,7 @@ pub fn register_wait(&self, file: &File, cv: &PollCondVar) {
> >  ///
> >  /// [`CondVar`]: crate::sync::CondVar
> >  #[pin_data(PinnedDrop)]
> > +#[repr(transparent)]
> >  pub struct PollCondVar {
> >      #[pin]
> >      inner: CondVar,
> > @@ -78,6 +88,17 @@ pub fn new(name: &'static CStr, key: Pin<&'static LockClassKey>) -> impl PinInit
> >              inner <- CondVar::new(name, key),
> >          })
> >      }
> > +
> > +    /// Use this `CondVar` as a `PollCondVar`.
> > +    ///
> > +    /// # Safety
> > +    ///
> > +    /// After the last use of the returned `&PollCondVar`, `__wake_up_pollfree` must be called on
> > +    /// the `wait_queue_head` at least one grace period before the `CondVar` is destroyed.
> > +    unsafe fn from_non_poll(c: &CondVar) -> &PollCondVar {
> > +        // SAFETY: Layout is the same. Caller ensures that PollTables are cleared in time.
> > +        unsafe { &*ptr::from_ref(c).cast() }
> > +    }
> >  }
> >  
> >  // Make the `CondVar` methods callable on `PollCondVar`.
> > @@ -104,3 +125,140 @@ fn drop(self: Pin<&mut Self>) {
> >          unsafe { bindings::synchronize_rcu() };
> >      }
> >  }
> > +
> > +/// Wrapper around [`CondVar`] that can be upgraded to [`PollCondVar`].
> > +///
> > +/// By using this wrapper, you can avoid rcu for cases that don't use [`PollTable`], and in all
> > +/// cases you can avoid `synchronize_rcu()`.
> > +///
> > +/// # Invariants
> > +///
> > +/// `active` either references `simple`, or a `kmalloc` allocation holding an
> > +/// `UpgradePollCondVarInner`. In the latter case, the allocation remains valid until
> > +/// `Self::drop()` plus one grace period.
> > +#[pin_data(PinnedDrop)]
> > +pub struct UpgradePollCondVar {
> > +    #[pin]
> > +    simple: CondVar,
> > +    active: Atomic<*const CondVar>,
> > +    #[pin]
> > +    _pin: PhantomPinned,
> > +}
> > +
> > +#[pin_data]
> > +#[repr(C)]
> > +struct UpgradePollCondVarInner {
> > +    #[pin]
> > +    upgraded: CondVar,
> > +    #[pin]
> > +    rcu: Opaque<bindings::callback_head>,
> > +}
> > +
> > +impl UpgradePollCondVar {
> > +    /// Constructs a new upgradable condvar initialiser.
> > +    pub fn new(name: &'static CStr, key: Pin<&'static LockClassKey>) -> impl PinInit<Self> {
> > +        pin_init!(&this in Self {
> > +            simple <- CondVar::new(name, key),
> > +            // SAFETY: `this->simple` is in-bounds. Pointer remains valid since this type is
> > +            // pinned.
> > +            active: Atomic::new(unsafe { &raw const (*this.as_ptr()).simple }),
> > +            _pin: PhantomPinned,
> > +        })
> > +    }
> > +
> > +    /// Obtain a [`PollCondVar`], upgrading if necessary.
> > +    ///
> > +    /// You should use the same lock as what is passed to the `wait_*` methods. Otherwise wakeups
> > +    /// may be missed.
> > +    pub fn poll<T: ?Sized, B: Backend>(
> > +        &self,
> > +        lock: &Lock<T, B>,
> > +        name: &'static CStr,
> > +        key: Pin<&'static LockClassKey>,
> > +    ) -> Result<&PollCondVar, AllocError> {
> > +        let mut ptr = self.active.load(Acquire);
> > +        if ptr::eq(ptr, &self.simple) {
> > +            self.upgrade(lock, name, key)?;
> > +            ptr = self.active.load(Acquire);
> > +            debug_assert_ne!(ptr, ptr::from_ref(&self.simple));
> > +        }
> > +        // SAFETY: Signature ensures that last use of returned `&PollCondVar` is before drop(), and
> > +        // drop() calls `__wake_up_pollfree` followed by waiting a grace period before the
> > +        // `CondVar` is destroyed.
> > +        Ok(unsafe { PollCondVar::from_non_poll(&*ptr) })
> > +    }
> > +
> > +    fn upgrade<T: ?Sized, B: Backend>(
> > +        &self,
> > +        lock: &Lock<T, B>,
> > +        name: &'static CStr,
> > +        key: Pin<&'static LockClassKey>,
> > +    ) -> Result<(), AllocError> {
> > +        let upgraded = KBox::pin_init(
> > +            pin_init!(UpgradePollCondVarInner {
> > +                upgraded <- CondVar::new(name, key),
> > +                rcu: Opaque::uninit(),
> > +            }),
> > +            GFP_KERNEL,
> > +        )
> > +        .map_err(|_| AllocError)?;
> > +
> > +        // SAFETY: The value is treated as pinned.
> > +        let upgraded = KBox::into_raw(unsafe { Pin::into_inner_unchecked(upgraded) });
> > +
> > +        let res = self.active.cmpxchg(
> > +            ptr::from_ref(&self.simple),
> > +            // SAFETY: This operation stays in-bounds of the above allocation.
> > +            unsafe { &raw mut (*upgraded).upgraded },
> > +            Release,
> > +        );
> > +
> > +        if res.is_err() {
> > +            // Already upgraded, so still succeess.
> > +            // SAFETY: The cmpxchg failed, so take back ownership of the box.
> > +            drop(unsafe { KBox::from_raw(upgraded) });
> > +            return Ok(());
> > +        }
> > +
> > +        // If a normal waiter registers in parallel with us, then either:
> > +        // * We took the lock first. In that case, the waiter sees the above cmpxchg.
> > +        // * They took the lock first. In that case, we wake them up below.
> > +        drop(lock.lock());
> > +        self.simple.notify_all();
> 
> Hmm.. what if the waiter gets its `&CondVar` before `upgrade()` and use
> that directly?
> 
> 	<waiter>				<in upgrade()>
> 	let poll_cv: &UpgradePollCondVar = ...;
> 	let cv = poll_cv.deref();
> 						cmpxchg();
> 						drop(lock.lock());
> 						self.simple.notify_all();
> 	let mut guard = lock.lock();
> 	cv.wait(&mut guard);
> 
> we still miss the wake-up, right?
> 
> It's creative, but I particularly hate we use an empty lock critical
> section to synchronize ;-)

I guess instead of exposing Deref, I can just implement `wait` directly
on `UpgradePollCondVar`. Then this API misuse is not possible.

> Do you think the complexity of a dynamic upgrading is worthwhile, or we
> should just use the box-allocated PollCondVar unconditionally?
> 
> I think if the current users won't benefit from the dynamic upgrading
> then we can avoid the complexity. We can always add it back later.
> Thoughts?

I do actually think it's worthwhile. Here is some data to consider:

I started an Android device running this. It created 3961 instances of
`UpgradePollCondVar` during the hour it ran, but only 5 were upgraded.

Alice


* Re: [PATCH v2 1/2] rust: poll: make PollCondVar upgradable
  2026-03-04  7:59     ` Alice Ryhl
@ 2026-03-04 16:29       ` Boqun Feng
  2026-03-04 21:37         ` Alice Ryhl
  0 siblings, 1 reply; 8+ messages in thread
From: Boqun Feng @ 2026-03-04 16:29 UTC (permalink / raw)
  To: Alice Ryhl
  Cc: Christian Brauner, Boqun Feng, Paul E. McKenney, Gary Guo,
	Greg Kroah-Hartman, Carlos Llamas, linux-fsdevel, rust-for-linux,
	linux-kernel

On Wed, Mar 04, 2026 at 07:59:59AM +0000, Alice Ryhl wrote:
[...]
> > > +        // If a normal waiter registers in parallel with us, then either:
> > > +        // * We took the lock first. In that case, the waiter sees the above cmpxchg.
> > > +        // * They took the lock first. In that case, we wake them up below.
> > > +        drop(lock.lock());
> > > +        self.simple.notify_all();
> > 
> > Hmm.. what if the waiter gets its `&CondVar` before `upgrade()` and use
> > that directly?
> > 
> > 	<waiter>				<in upgrade()>
> > 	let poll_cv: &UpgradePollCondVar = ...;
> > 	let cv = poll_cv.deref();
> > 						cmpxchg();
> > 						drop(lock.lock());
> > 						self.simple.notify_all();
> > 	let mut guard = lock.lock();
> > 	cv.wait(&mut guard);
> > 
> > we still miss the wake-up, right?
> > 
> > It's creative, but I particularly hate we use an empty lock critical
> > section to synchronize ;-)
> 
> I guess instead of exposing Deref, I can just implement `wait` directly
> on `UpgradePollCondVar`. Then this API misuse is not possible.
> 

If we do that, then we can avoid the `drop(lock.lock())` as well, if
we do something like:

    impl UpgradePollCondVar {
        pub fn wait(...) {
            prepare_to_wait_exclusive(); // <- this takes the lock in
                                         // simple.wait_queue_head, so
                                         // either upgrade() comes
                                         // first, or it observes the
                                         // wait being queued.
            let cv_ptr = self.active.load(Relaxed);
            if !ptr_eq(cv_ptr, &self.simple) { // We have moved away
                                               // from simple, so we
                                               // need to wake up and
                                               // redo the wait.
                finish_wait();
            } else {
                guard.do_unlock(|| { schedule_timeout(); });
                finish_wait();
            }
        }
    }

(CondVar::notify*() will take the wait_queue_head lock as well)

> > Do you think the complexity of a dynamic upgrading is worthwhile, or we
> > should just use the box-allocated PollCondVar unconditionally?
> > 
> > I think if the current users won't benefit from the dynamic upgrading
> > then we can avoid the complexity. We can always add it back later.
> > Thoughts?
> 
> I do actually think it's worthwhile to consider:
> 
> I started an Android device running this. It created 3961 instances of
> `UpgradePollCondVar` during the hour it ran, but only 5 were upgraded.
> 

That makes sense, thank you for providing the data! But I still think we
should be more informed about the performance difference between dynamic
upgrading and an unconditionally box-allocated PollCondVar: I would
assume that when an `UpgradePollCondVar` is created, other allocations
happen as well (e.g. when creating an Arc<binder::Thread>), so the extra
cost of the allocation may be unnoticeable.

Regards,
Boqun


> Alice


* Re: [PATCH v2 1/2] rust: poll: make PollCondVar upgradable
  2026-03-04 16:29       ` Boqun Feng
@ 2026-03-04 21:37         ` Alice Ryhl
  2026-03-04 23:36           ` Boqun Feng
  0 siblings, 1 reply; 8+ messages in thread
From: Alice Ryhl @ 2026-03-04 21:37 UTC (permalink / raw)
  To: Boqun Feng
  Cc: Christian Brauner, Boqun Feng, Paul E. McKenney, Gary Guo,
	Greg Kroah-Hartman, Carlos Llamas, linux-fsdevel, rust-for-linux,
	linux-kernel

On Wed, Mar 04, 2026 at 08:29:12AM -0800, Boqun Feng wrote:
> On Wed, Mar 04, 2026 at 07:59:59AM +0000, Alice Ryhl wrote:
> [...]
> > > > +        // If a normal waiter registers in parallel with us, then either:
> > > > +        // * We took the lock first. In that case, the waiter sees the above cmpxchg.
> > > > +        // * They took the lock first. In that case, we wake them up below.
> > > > +        drop(lock.lock());
> > > > +        self.simple.notify_all();
> > > 
> > > Hmm.. what if the waiter gets its `&CondVar` before `upgrade()` and use
> > > that directly?
> > > 
> > > 	<waiter>				<in upgrade()>
> > > 	let poll_cv: &UpgradePollCondVar = ...;
> > > 	let cv = poll_cv.deref();
> > > 						cmpxchg();
> > > 						drop(lock.lock());
> > > 						self.simple.notify_all();
> > > 	let mut guard = lock.lock();
> > > 	cv.wait(&mut guard);
> > > 
> > > we still miss the wake-up, right?
> > > 
> > > It's creative, but I particularly hate we use an empty lock critical
> > > section to synchronize ;-)
> > 
> > I guess instead of exposing Deref, I can just implement `wait` directly
> > on `UpgradePollCondVar`. Then this API misuse is not possible.
> > 
> 
> If we do that,then we can avoid the `drop(lock.lock())` as well, if we
> do:
> 
>     impl UpgradePollCondVar {
>         pub fn wait(...) {
> 	    prepare_to_wait_exclusive(); // <- this will take lock in
>                                          // simple.wait_queue_head. So
>                                          // either upgrade() comes
>                                          // first, or they observe the
>                                          // wait being queued.
>             let cv_ptr = self.active.load(Relaxed);
> 	    if !ptr_eq(cv_ptr, &self.simple) { // We have moved from
> 	                                       // simple, so need to
>                                                // need to wake up and
>                                                // redo the wait.
> 	        finish_wait();
> 	    } else {
> 	        guard.do_unlock(|| { schedule_timeout(); });
> 		finish_wait();
> 	    }
> 	}
>     }
> 
> (CondVar::notify*() will take the wait_queue_head lock as well)

Yeah, but then I have to actually re-implement those methods and not
just call them. That seems not worth it.

> > > Do you think the complexity of a dynamic upgrading is worthwhile, or we
> > > should just use the box-allocated PollCondVar unconditionally?
> > > 
> > > I think if the current users won't benefit from the dynamic upgrading
> > > then we can avoid the complexity. We can always add it back later.
> > > Thoughts?
> > 
> > I do actually think it's worthwhile to consider:
> > 
> > I started an Android device running this. It created 3961 instances of
> > `UpgradePollCondVar` during the hour it ran, but only 5 were upgraded.
> > 
> 
> That makes sense, thank you for providing the data! But still I think we
> should be more informative about the performance difference between
> dynamic upgrading vs. unconditionally box-allocated PollCondVar, because
> I would assume when a `UpgradePollCondVar` is created, other allocations
> also happen as well (e.g. when creating a Arc<binder::Thread>), so the
> extra cost of the allocation may be unnoticeable.

Perf-wise it doesn't matter, but I'm concerned about memory usage.

Alice


* Re: [PATCH v2 1/2] rust: poll: make PollCondVar upgradable
  2026-03-04 21:37         ` Alice Ryhl
@ 2026-03-04 23:36           ` Boqun Feng
  0 siblings, 0 replies; 8+ messages in thread
From: Boqun Feng @ 2026-03-04 23:36 UTC (permalink / raw)
  To: Alice Ryhl
  Cc: Christian Brauner, Boqun Feng, Paul E. McKenney, Gary Guo,
	Greg Kroah-Hartman, Carlos Llamas, linux-fsdevel, rust-for-linux,
	linux-kernel

On Wed, Mar 04, 2026 at 09:37:45PM +0000, Alice Ryhl wrote:
> On Wed, Mar 04, 2026 at 08:29:12AM -0800, Boqun Feng wrote:
> > On Wed, Mar 04, 2026 at 07:59:59AM +0000, Alice Ryhl wrote:
> > [...]
> > > > > +        // If a normal waiter registers in parallel with us, then either:
> > > > > +        // * We took the lock first. In that case, the waiter sees the above cmpxchg.
> > > > > +        // * They took the lock first. In that case, we wake them up below.
> > > > > +        drop(lock.lock());
> > > > > +        self.simple.notify_all();
> > > > 
> > > > Hmm.. what if the waiter gets its `&CondVar` before `upgrade()` and use
> > > > that directly?
> > > > 
> > > > 	<waiter>				<in upgrade()>
> > > > 	let poll_cv: &UpgradePollCondVar = ...;
> > > > 	let cv = poll_cv.deref();
> > > > 						cmpxchg();
> > > > 						drop(lock.lock());
> > > > 						self.simple.notify_all();
> > > > 	let mut guard = lock.lock();
> > > > 	cv.wait(&mut guard);
> > > > 
> > > > we still miss the wake-up, right?
> > > > 
> > > > It's creative, but I particularly hate we use an empty lock critical
> > > > section to synchronize ;-)
> > > 
> > > I guess instead of exposing Deref, I can just implement `wait` directly
> > > on `UpgradePollCondVar`. Then this API misuse is not possible.
> > > 
> > 
> > If we do that,then we can avoid the `drop(lock.lock())` as well, if we
> > do:
> > 
> >     impl UpgradePollCondVar {
> >         pub fn wait(...) {
> > 	    prepare_to_wait_exclusive(); // <- this will take lock in
> >                                          // simple.wait_queue_head. So
> >                                          // either upgrade() comes
> >                                          // first, or they observe the
> >                                          // wait being queued.
> >             let cv_ptr = self.active.load(Relaxed);
> > 	    if !ptr_eq(cv_ptr, &self.simple) { // We have moved from
> > 	                                       // simple, so need to
> >                                                // need to wake up and
> >                                                // redo the wait.
> > 	        finish_wait();
> > 	    } else {
> > 	        guard.do_unlock(|| { schedule_timeout(); });
> > 		finish_wait();
> > 	    }
> > 	}
> >     }
> > 
> > (CondVar::notify*() will take the wait_queue_head lock as well)
> 
> Yeah but then I have to actually re-implement those methods and not just
> call them. Seems not worth it.
> 

We can pass a closure to wait_*() as the condition:

    fn wait_internal<T: ?Sized, B: Backend, F: FnOnce() -> bool>(
        &self,
        wait_state: c_int,
        guard: &mut Guard<'_, T, B>,
        cond: Option<F>,
        timeout_in_jiffies: c_long,
    ) -> c_long {

I'm not suggesting this only because it helps in this case. In the more
general pattern (see the ___wait_event() macro in include/linux/wait.h),
checking the condition after prepare_to_wait*() is needed to prevent
missed wake-ups. So maybe in the long term we will also need condition
checking for `CondVar`.

Plus, you don't need to pass a &Lock to poll() if you do this ;-)

> > > > Do you think the complexity of a dynamic upgrading is worthwhile, or we
> > > > should just use the box-allocated PollCondVar unconditionally?
> > > > 
> > > > I think if the current users won't benefit from the dynamic upgrading
> > > > then we can avoid the complexity. We can always add it back later.
> > > > Thoughts?
> > > 
> > > I do actually think it's worthwhile to consider:
> > > 
> > > I started an Android device running this. It created 3961 instances of
> > > `UpgradePollCondVar` during the hour it ran, but only 5 were upgraded.
> > > 
> > 
> > That makes sense, thank you for providing the data! But still I think we
> > should be more informative about the performance difference between
> > dynamic upgrading vs. unconditionally box-allocated PollCondVar, because
> > I would assume when a `UpgradePollCondVar` is created, other allocations
> > also happen as well (e.g. when creating a Arc<binder::Thread>), so the
> > extra cost of the allocation may be unnoticeable.
> 
> Perf-wise it doesn't matter, but I'm concerned about memory usage.
> 

Let's see, we are comparing the memory cost between:

(assuming on a 64-bit system, and LOCKDEP=n)

    struct UpgradePollCondVar {
        simple: CondVar,        // <- 24 bytes (1 spinlock + 2 pointers)
        active: Atomic<*const UpgradePollCondVarInner>, // <- 8 bytes.
                                                        // but +40 extra
                                                        // bytes on the
                                                        // heap in the
                                                        // worst case.
    }

vs

    struct BoxedPollCondVar {
        active: Box<UpgradePollCondVarInner>, // <- 8 bytes, but +40
                                              // extra bytes on the heap
    }

that's an extra 16 bytes per binder::Thread, but binder::Thread itself
is more than 100 bytes. Of course it's up to binder whether 16 bytes per
thread is a lot or not, but personally I would choose the simplicity ;-)

Regards,
Boqun

> Alice

Thread overview: 8+ messages
2026-02-13 11:29 [PATCH v2 0/2] Avoid synchronize_rcu() for every thread drop in Rust Binder Alice Ryhl
2026-02-13 11:29 ` [PATCH v2 1/2] rust: poll: make PollCondVar upgradable Alice Ryhl
2026-03-03 22:08   ` Boqun Feng
2026-03-04  7:59     ` Alice Ryhl
2026-03-04 16:29       ` Boqun Feng
2026-03-04 21:37         ` Alice Ryhl
2026-03-04 23:36           ` Boqun Feng
2026-02-13 11:29 ` [PATCH v2 2/2] rust_binder: use UpgradePollCondVar Alice Ryhl
