From: Alice Ryhl <aliceryhl@google.com>
To: Boqun Feng <boqun@kernel.org>
Cc: Christian Brauner <brauner@kernel.org>,
Boqun Feng <boqun.feng@gmail.com>,
"Paul E. McKenney" <paulmck@kernel.org>,
Gary Guo <gary@garyguo.net>,
Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
Carlos Llamas <cmllamas@google.com>,
linux-fsdevel@vger.kernel.org, rust-for-linux@vger.kernel.org,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2 1/2] rust: poll: make PollCondVar upgradable
Date: Wed, 4 Mar 2026 07:59:59 +0000
Message-ID: <aafmf5icyPIFcwf_@google.com>
In-Reply-To: <aadbyBmaV8zCYiog@tardis.local>

On Tue, Mar 03, 2026 at 02:08:08PM -0800, Boqun Feng wrote:
> On Fri, Feb 13, 2026 at 11:29:41AM +0000, Alice Ryhl wrote:
> > Rust Binder currently uses PollCondVar, but it calls synchronize_rcu()
> > in the destructor, which we would like to avoid. Add a variation of
> > PollCondVar, which uses kfree_rcu() instead.
> >
> > Signed-off-by: Alice Ryhl <aliceryhl@google.com>
> > ---
> > rust/kernel/sync/poll.rs | 160 ++++++++++++++++++++++++++++++++++++++++++++++-
> > 1 file changed, 159 insertions(+), 1 deletion(-)
> >
> > diff --git a/rust/kernel/sync/poll.rs b/rust/kernel/sync/poll.rs
> > index 0ec985d560c8d3405c08dbd86e48b14c7c34484d..9555f818a24d777dd908fca849015c3490ce38d3 100644
> > --- a/rust/kernel/sync/poll.rs
> > +++ b/rust/kernel/sync/poll.rs
> > @@ -5,12 +5,21 @@
> >  //! Utilities for working with `struct poll_table`.
> >
> >  use crate::{
> > +    alloc::AllocError,
> >      bindings,
> > +    container_of,
> >      fs::File,
> >      prelude::*,
> > +    sync::atomic::{Acquire, Atomic, Relaxed, Release},
> > +    sync::lock::{Backend, Lock},
> >      sync::{CondVar, LockClassKey},
> > +    types::Opaque, //
> > +};
> > +use core::{
> > +    marker::{PhantomData, PhantomPinned},
> > +    ops::Deref,
> > +    ptr,
> >  };
> > -use core::{marker::PhantomData, ops::Deref};
> >
> >  /// Creates a [`PollCondVar`] initialiser with the given name and a newly-created lock class.
> >  #[macro_export]
> > @@ -66,6 +75,7 @@ pub fn register_wait(&self, file: &File, cv: &PollCondVar) {
> >  ///
> >  /// [`CondVar`]: crate::sync::CondVar
> >  #[pin_data(PinnedDrop)]
> > +#[repr(transparent)]
> >  pub struct PollCondVar {
> >      #[pin]
> >      inner: CondVar,
> > @@ -78,6 +88,17 @@ pub fn new(name: &'static CStr, key: Pin<&'static LockClassKey>) -> impl PinInit
> >              inner <- CondVar::new(name, key),
> >          })
> >      }
> > +
> > +    /// Use this `CondVar` as a `PollCondVar`.
> > +    ///
> > +    /// # Safety
> > +    ///
> > +    /// After the last use of the returned `&PollCondVar`, `__wake_up_pollfree` must be called on
> > +    /// the `wait_queue_head` at least one grace period before the `CondVar` is destroyed.
> > +    unsafe fn from_non_poll(c: &CondVar) -> &PollCondVar {
> > +        // SAFETY: Layout is the same. Caller ensures that PollTables are cleared in time.
> > +        unsafe { &*ptr::from_ref(c).cast() }
> > +    }
> >  }
> >
> >  // Make the `CondVar` methods callable on `PollCondVar`.
> > @@ -104,3 +125,140 @@ fn drop(self: Pin<&mut Self>) {
> >          unsafe { bindings::synchronize_rcu() };
> >      }
> >  }
> > +
> > +/// Wrapper around [`CondVar`] that can be upgraded to [`PollCondVar`].
> > +///
> > +/// By using this wrapper, you can avoid rcu for cases that don't use [`PollTable`], and in all
> > +/// cases you can avoid `synchronize_rcu()`.
> > +///
> > +/// # Invariants
> > +///
> > +/// `active` either references `simple`, or a `kmalloc` allocation holding an
> > +/// `UpgradePollCondVarInner`. In the latter case, the allocation remains valid until
> > +/// `Self::drop()` plus one grace period.
> > +#[pin_data(PinnedDrop)]
> > +pub struct UpgradePollCondVar {
> > +    #[pin]
> > +    simple: CondVar,
> > +    active: Atomic<*const CondVar>,
> > +    #[pin]
> > +    _pin: PhantomPinned,
> > +}
> > +
> > +#[pin_data]
> > +#[repr(C)]
> > +struct UpgradePollCondVarInner {
> > +    #[pin]
> > +    upgraded: CondVar,
> > +    #[pin]
> > +    rcu: Opaque<bindings::callback_head>,
> > +}
> > +
> > +impl UpgradePollCondVar {
> > +    /// Constructs a new upgradable condvar initialiser.
> > +    pub fn new(name: &'static CStr, key: Pin<&'static LockClassKey>) -> impl PinInit<Self> {
> > +        pin_init!(&this in Self {
> > +            simple <- CondVar::new(name, key),
> > +            // SAFETY: `this->simple` is in-bounds. Pointer remains valid since this type is
> > +            // pinned.
> > +            active: Atomic::new(unsafe { &raw const (*this.as_ptr()).simple }),
> > +            _pin: PhantomPinned,
> > +        })
> > +    }
> > +
> > +    /// Obtain a [`PollCondVar`], upgrading if necessary.
> > +    ///
> > +    /// You should use the same lock as what is passed to the `wait_*` methods. Otherwise wakeups
> > +    /// may be missed.
> > +    pub fn poll<T: ?Sized, B: Backend>(
> > +        &self,
> > +        lock: &Lock<T, B>,
> > +        name: &'static CStr,
> > +        key: Pin<&'static LockClassKey>,
> > +    ) -> Result<&PollCondVar, AllocError> {
> > +        let mut ptr = self.active.load(Acquire);
> > +        if ptr::eq(ptr, &self.simple) {
> > +            self.upgrade(lock, name, key)?;
> > +            ptr = self.active.load(Acquire);
> > +            debug_assert_ne!(ptr, ptr::from_ref(&self.simple));
> > +        }
> > +        // SAFETY: Signature ensures that last use of returned `&PollCondVar` is before drop(), and
> > +        // drop() calls `__wake_up_pollfree` followed by waiting a grace period before the
> > +        // `CondVar` is destroyed.
> > +        Ok(unsafe { PollCondVar::from_non_poll(&*ptr) })
> > +    }
> > +
> > +    fn upgrade<T: ?Sized, B: Backend>(
> > +        &self,
> > +        lock: &Lock<T, B>,
> > +        name: &'static CStr,
> > +        key: Pin<&'static LockClassKey>,
> > +    ) -> Result<(), AllocError> {
> > +        let upgraded = KBox::pin_init(
> > +            pin_init!(UpgradePollCondVarInner {
> > +                upgraded <- CondVar::new(name, key),
> > +                rcu: Opaque::uninit(),
> > +            }),
> > +            GFP_KERNEL,
> > +        )
> > +        .map_err(|_| AllocError)?;
> > +
> > +        // SAFETY: The value is treated as pinned.
> > +        let upgraded = KBox::into_raw(unsafe { Pin::into_inner_unchecked(upgraded) });
> > +
> > +        let res = self.active.cmpxchg(
> > +            ptr::from_ref(&self.simple),
> > +            // SAFETY: This operation stays in-bounds of the above allocation.
> > +            unsafe { &raw mut (*upgraded).upgraded },
> > +            Release,
> > +        );
> > +
> > +        if res.is_err() {
> > +            // Already upgraded, so still success.
> > +            // SAFETY: The cmpxchg failed, so take back ownership of the box.
> > +            drop(unsafe { KBox::from_raw(upgraded) });
> > +            return Ok(());
> > +        }
> > +
> > +        // If a normal waiter registers in parallel with us, then either:
> > +        // * We took the lock first. In that case, the waiter sees the above cmpxchg.
> > +        // * They took the lock first. In that case, we wake them up below.
> > +        drop(lock.lock());
> > +        self.simple.notify_all();
>
> Hmm.. what if the waiter gets its `&CondVar` before `upgrade()` and uses
> that directly?
>
>     <waiter>                                  <in upgrade()>
>     let poll_cv: &UpgradePollCondVar = ...;
>     let cv = poll_cv.deref();
>                                               cmpxchg();
>                                               drop(lock.lock());
>                                               self.simple.notify_all();
>     let mut guard = lock.lock();
>     cv.wait(&mut guard);
>
> we still miss the wake-up, right?
>
> It's creative, but I particularly hate that we use an empty lock critical
> section to synchronize ;-)

I guess instead of exposing Deref, I can just implement `wait` directly
on `UpgradePollCondVar`. Then this API misuse is not possible.
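
As a rough userspace illustration of that idea (not the kernel code): the
sketch below uses `std::sync::{Condvar, Mutex}` and an `AtomicPtr` in place of
the kernel `CondVar`, `Lock` and `Atomic` types. The name `UpgradableCondvar`,
the null-pointer encoding for "not upgraded yet", and the `demo()` helper are
all assumptions made for this example. The point is only that `wait()` picks
the active condvar while the caller already holds the lock, so a waiter never
gets to keep a stale `&CondVar` across the upgrade.

```rust
use std::ptr;
use std::sync::atomic::{AtomicPtr, Ordering};
use std::sync::{Arc, Condvar, Mutex, MutexGuard};
use std::thread;

/// Hypothetical userspace analogue: null `upgraded` means "use `simple`".
pub struct UpgradableCondvar {
    simple: Condvar,
    upgraded: AtomicPtr<Condvar>,
}

impl UpgradableCondvar {
    pub fn new() -> Self {
        Self { simple: Condvar::new(), upgraded: AtomicPtr::new(ptr::null_mut()) }
    }

    /// Returns whichever condvar is currently active.
    fn current(&self) -> &Condvar {
        let p = self.upgraded.load(Ordering::Acquire);
        if p.is_null() {
            &self.simple
        } else {
            // SAFETY: a non-null pointer came from `Box::into_raw` in
            // `upgrade()` and is only freed in `drop()`.
            unsafe { &*p }
        }
    }

    pub fn is_upgraded(&self) -> bool {
        !self.upgraded.load(Ordering::Acquire).is_null()
    }

    /// No `Deref` is exposed: the condvar is chosen here, with the lock held.
    pub fn wait<'a, T>(&self, guard: MutexGuard<'a, T>) -> MutexGuard<'a, T> {
        self.current().wait(guard).unwrap()
    }

    pub fn notify_all(&self) {
        self.current().notify_all();
    }

    /// Switch to a heap-allocated condvar, then wake anyone still on `simple`.
    pub fn upgrade<T>(&self, lock: &Mutex<T>) {
        let new = Box::into_raw(Box::new(Condvar::new()));
        let res = self.upgraded.compare_exchange(
            ptr::null_mut(),
            new,
            Ordering::Release,
            Ordering::Relaxed,
        );
        if res.is_err() {
            // Already upgraded by someone else; free our unused allocation.
            // SAFETY: `new` was never published.
            drop(unsafe { Box::from_raw(new) });
            return;
        }
        // Waiters that chose `simple` did so under the lock, so after this
        // lock/unlock they are either queued on `simple` or will see `new`.
        drop(lock.lock().unwrap());
        self.simple.notify_all();
    }
}

impl Drop for UpgradableCondvar {
    fn drop(&mut self) {
        let p = *self.upgraded.get_mut();
        if !p.is_null() {
            // SAFETY: `p` came from `Box::into_raw` and is freed exactly once.
            drop(unsafe { Box::from_raw(p) });
        }
    }
}

/// One waiter racing with an upgrade; returns the flag the waiter observed.
pub fn demo() -> bool {
    let state = Arc::new((Mutex::new(false), UpgradableCondvar::new()));
    let waiter = {
        let state = Arc::clone(&state);
        thread::spawn(move || {
            let (lock, cv) = &*state;
            let mut ready = lock.lock().unwrap();
            while !*ready {
                ready = cv.wait(ready);
            }
            *ready
        })
    };
    let (lock, cv) = &*state;
    cv.upgrade(lock);
    *lock.lock().unwrap() = true;
    cv.notify_all();
    waiter.join().unwrap()
}
```

Waiters woken by the upgrade simply re-check their condition and, if they
need to sleep again, go back through `wait()` and land on the new condvar.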

> Do you think the complexity of dynamic upgrading is worthwhile, or
> should we just use the box-allocated PollCondVar unconditionally?
>
> I think if the current users won't benefit from the dynamic upgrading
> then we can avoid the complexity. We can always add it back later.
> Thoughts?

I do actually think it's worthwhile to consider:
I started an Android device running this. It created 3961 instances of
`UpgradePollCondVar` during the hour it ran, but only 5 were upgraded.

Alice