From: Alice Ryhl <aliceryhl@google.com>
To: Boqun Feng <boqun@kernel.org>
Cc: Christian Brauner <brauner@kernel.org>,
	Boqun Feng <boqun.feng@gmail.com>,
	"Paul E. McKenney" <paulmck@kernel.org>,
	Gary Guo <gary@garyguo.net>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	Carlos Llamas <cmllamas@google.com>,
	linux-fsdevel@vger.kernel.org, rust-for-linux@vger.kernel.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2 1/2] rust: poll: make PollCondVar upgradable
Date: Wed, 4 Mar 2026 07:59:59 +0000
Message-ID: <aafmf5icyPIFcwf_@google.com>
In-Reply-To: <aadbyBmaV8zCYiog@tardis.local>

On Tue, Mar 03, 2026 at 02:08:08PM -0800, Boqun Feng wrote:
> On Fri, Feb 13, 2026 at 11:29:41AM +0000, Alice Ryhl wrote:
> > Rust Binder currently uses PollCondVar, but it calls synchronize_rcu()
> > in the destructor, which we would like to avoid. Add a variation of
> > PollCondVar, which uses kfree_rcu() instead.
> >
> > Signed-off-by: Alice Ryhl <aliceryhl@google.com>
> > ---
> > rust/kernel/sync/poll.rs | 160 ++++++++++++++++++++++++++++++++++++++++++++++-
> > 1 file changed, 159 insertions(+), 1 deletion(-)
> >
> > diff --git a/rust/kernel/sync/poll.rs b/rust/kernel/sync/poll.rs
> > index 0ec985d560c8d3405c08dbd86e48b14c7c34484d..9555f818a24d777dd908fca849015c3490ce38d3 100644
> > --- a/rust/kernel/sync/poll.rs
> > +++ b/rust/kernel/sync/poll.rs
> > @@ -5,12 +5,21 @@
> > //! Utilities for working with `struct poll_table`.
> >
> > use crate::{
> > + alloc::AllocError,
> > bindings,
> > + container_of,
> > fs::File,
> > prelude::*,
> > + sync::atomic::{Acquire, Atomic, Relaxed, Release},
> > + sync::lock::{Backend, Lock},
> > sync::{CondVar, LockClassKey},
> > + types::Opaque, //
> > +};
> > +use core::{
> > + marker::{PhantomData, PhantomPinned},
> > + ops::Deref,
> > + ptr,
> > };
> > -use core::{marker::PhantomData, ops::Deref};
> >
> > /// Creates a [`PollCondVar`] initialiser with the given name and a newly-created lock class.
> > #[macro_export]
> > @@ -66,6 +75,7 @@ pub fn register_wait(&self, file: &File, cv: &PollCondVar) {
> > ///
> > /// [`CondVar`]: crate::sync::CondVar
> > #[pin_data(PinnedDrop)]
> > +#[repr(transparent)]
> > pub struct PollCondVar {
> > #[pin]
> > inner: CondVar,
> > @@ -78,6 +88,17 @@ pub fn new(name: &'static CStr, key: Pin<&'static LockClassKey>) -> impl PinInit
> > inner <- CondVar::new(name, key),
> > })
> > }
> > +
> > + /// Use this `CondVar` as a `PollCondVar`.
> > + ///
> > + /// # Safety
> > + ///
> > + /// After the last use of the returned `&PollCondVar`, `__wake_up_pollfree` must be called on
> > + /// the `wait_queue_head` at least one grace period before the `CondVar` is destroyed.
> > + unsafe fn from_non_poll(c: &CondVar) -> &PollCondVar {
> > + // SAFETY: Layout is the same. Caller ensures that PollTables are cleared in time.
> > + unsafe { &*ptr::from_ref(c).cast() }
> > + }
> > }
> >
> > // Make the `CondVar` methods callable on `PollCondVar`.
> > @@ -104,3 +125,140 @@ fn drop(self: Pin<&mut Self>) {
> > unsafe { bindings::synchronize_rcu() };
> > }
> > }
> > +
> > +/// Wrapper around [`CondVar`] that can be upgraded to [`PollCondVar`].
> > +///
> > +/// By using this wrapper, you can avoid rcu for cases that don't use [`PollTable`], and in all
> > +/// cases you can avoid `synchronize_rcu()`.
> > +///
> > +/// # Invariants
> > +///
> > +/// `active` either references `simple`, or a `kmalloc` allocation holding an
> > +/// `UpgradePollCondVarInner`. In the latter case, the allocation remains valid until
> > +/// `Self::drop()` plus one grace period.
> > +#[pin_data(PinnedDrop)]
> > +pub struct UpgradePollCondVar {
> > + #[pin]
> > + simple: CondVar,
> > + active: Atomic<*const CondVar>,
> > + #[pin]
> > + _pin: PhantomPinned,
> > +}
> > +
> > +#[pin_data]
> > +#[repr(C)]
> > +struct UpgradePollCondVarInner {
> > + #[pin]
> > + upgraded: CondVar,
> > + #[pin]
> > + rcu: Opaque<bindings::callback_head>,
> > +}
> > +
> > +impl UpgradePollCondVar {
> > + /// Constructs a new upgradable condvar initialiser.
> > + pub fn new(name: &'static CStr, key: Pin<&'static LockClassKey>) -> impl PinInit<Self> {
> > + pin_init!(&this in Self {
> > + simple <- CondVar::new(name, key),
> > + // SAFETY: `this->simple` is in-bounds. Pointer remains valid since this type is
> > + // pinned.
> > + active: Atomic::new(unsafe { &raw const (*this.as_ptr()).simple }),
> > + _pin: PhantomPinned,
> > + })
> > + }
> > +
> > + /// Obtain a [`PollCondVar`], upgrading if necessary.
> > + ///
> > + /// You should use the same lock as what is passed to the `wait_*` methods. Otherwise wakeups
> > + /// may be missed.
> > + pub fn poll<T: ?Sized, B: Backend>(
> > + &self,
> > + lock: &Lock<T, B>,
> > + name: &'static CStr,
> > + key: Pin<&'static LockClassKey>,
> > + ) -> Result<&PollCondVar, AllocError> {
> > + let mut ptr = self.active.load(Acquire);
> > + if ptr::eq(ptr, &self.simple) {
> > + self.upgrade(lock, name, key)?;
> > + ptr = self.active.load(Acquire);
> > + debug_assert_ne!(ptr, ptr::from_ref(&self.simple));
> > + }
> > + // SAFETY: Signature ensures that last use of returned `&PollCondVar` is before drop(), and
> > + // drop() calls `__wake_up_pollfree` followed by waiting a grace period before the
> > + // `CondVar` is destroyed.
> > + Ok(unsafe { PollCondVar::from_non_poll(&*ptr) })
> > + }
> > +
> > + fn upgrade<T: ?Sized, B: Backend>(
> > + &self,
> > + lock: &Lock<T, B>,
> > + name: &'static CStr,
> > + key: Pin<&'static LockClassKey>,
> > + ) -> Result<(), AllocError> {
> > + let upgraded = KBox::pin_init(
> > + pin_init!(UpgradePollCondVarInner {
> > + upgraded <- CondVar::new(name, key),
> > + rcu: Opaque::uninit(),
> > + }),
> > + GFP_KERNEL,
> > + )
> > + .map_err(|_| AllocError)?;
> > +
> > + // SAFETY: The value is treated as pinned.
> > + let upgraded = KBox::into_raw(unsafe { Pin::into_inner_unchecked(upgraded) });
> > +
> > + let res = self.active.cmpxchg(
> > + ptr::from_ref(&self.simple),
> > + // SAFETY: This operation stays in-bounds of the above allocation.
> > + unsafe { &raw mut (*upgraded).upgraded },
> > + Release,
> > + );
> > +
> > + if res.is_err() {
> > + // Already upgraded, so still success.
> > + // SAFETY: The cmpxchg failed, so take back ownership of the box.
> > + drop(unsafe { KBox::from_raw(upgraded) });
> > + return Ok(());
> > + }
> > +
> > + // If a normal waiter registers in parallel with us, then either:
> > + // * We took the lock first. In that case, the waiter sees the above cmpxchg.
> > + // * They took the lock first. In that case, we wake them up below.
> > + drop(lock.lock());
> > + self.simple.notify_all();
>
> Hmm.. what if the waiter gets its `&CondVar` before `upgrade()` and uses
> that directly?
>
>	<waiter>                                  <in upgrade()>
>	let poll_cv: &UpgradePollCondVar = ...;
>	let cv = poll_cv.deref();
>	                                          cmpxchg();
>	                                          drop(lock.lock());
>	                                          self.simple.notify_all();
>	let mut guard = lock.lock();
>	cv.wait(&mut guard);
>
> we still miss the wake-up, right?
>
> It's creative, but I particularly hate that we use an empty lock critical
> section to synchronize ;-)

I guess instead of exposing Deref, I can just implement `wait` directly
on `UpgradePollCondVar`. Then this API misuse is not possible.

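As a rough illustration of that idea, here is a userspace sketch that models the wrapper with `std` types. It is not the kernel API: `UpgradableCondvar`, `is_upgraded` and `demo` are hypothetical names, `std::sync::{Mutex, Condvar}` stand in for the kernel's `Lock` and `CondVar`, a null pointer stands in for "still using `simple`", and the RCU-deferred free is simplified to a plain free in `Drop`. The point it tries to show is that `wait` picks the active condvar while the caller already holds the lock, so a stale `&CondVar` obtained before `upgrade()` cannot be waited on.

```rust
use std::ptr;
use std::sync::atomic::{AtomicPtr, Ordering};
use std::sync::{Condvar, Mutex, MutexGuard};
use std::thread;

/// Hypothetical userspace model of an upgradable condvar (not the kernel type).
pub struct UpgradableCondvar {
    simple: Condvar,
    /// Null while not upgraded; otherwise points to a heap-allocated condvar.
    upgraded: AtomicPtr<Condvar>,
}

impl UpgradableCondvar {
    pub fn new() -> Self {
        Self { simple: Condvar::new(), upgraded: AtomicPtr::new(ptr::null_mut()) }
    }

    pub fn is_upgraded(&self) -> bool {
        !self.upgraded.load(Ordering::Acquire).is_null()
    }

    /// Private on purpose: callers never get to keep a `&Condvar` around.
    fn active(&self) -> &Condvar {
        let p = self.upgraded.load(Ordering::Acquire);
        if p.is_null() {
            &self.simple
        } else {
            // SAFETY: once published, the allocation lives until `drop(&mut self)`.
            unsafe { &*p }
        }
    }

    /// The guard proves the lock is held, so the active condvar is chosen
    /// inside the critical section rather than ahead of time.
    pub fn wait<'a, T>(&self, guard: MutexGuard<'a, T>) -> MutexGuard<'a, T> {
        self.active().wait(guard).unwrap()
    }

    pub fn notify_all(&self) {
        self.active().notify_all();
    }

    /// Publish a new condvar, then flush out waiters still parked on `simple`.
    pub fn upgrade<T>(&self, lock: &Mutex<T>) {
        let new = Box::into_raw(Box::new(Condvar::new()));
        let res = self.upgraded.compare_exchange(
            ptr::null_mut(),
            new,
            Ordering::Release,
            Ordering::Relaxed,
        );
        if res.is_err() {
            // Someone else upgraded first; take the allocation back and free it.
            // SAFETY: `new` was never published, so we still own it.
            drop(unsafe { Box::from_raw(new) });
            return;
        }
        // A waiter that locked before us is now parked on `simple` and gets
        // woken below; a waiter that locks after us sees the new pointer.
        drop(lock.lock().unwrap());
        self.simple.notify_all();
    }
}

impl Drop for UpgradableCondvar {
    fn drop(&mut self) {
        let p = *self.upgraded.get_mut();
        if !p.is_null() {
            // SAFETY: `&mut self` means no waiter can still reference it.
            drop(unsafe { Box::from_raw(p) });
        }
    }
}

/// One waiter races against an upgrade followed by a wake-up.
pub fn demo() -> bool {
    let cv = UpgradableCondvar::new();
    let ready = Mutex::new(false);
    thread::scope(|s| {
        let waiter = s.spawn(|| {
            let mut guard = ready.lock().unwrap();
            while !*guard {
                guard = cv.wait(guard);
            }
            *guard
        });
        cv.upgrade(&ready);
        *ready.lock().unwrap() = true;
        cv.notify_all();
        waiter.join().unwrap()
    })
}
```

In this model the waiter loops on its condition, so a wake-up delivered by `upgrade()` simply makes it call `wait` again, at which point it parks on the upgraded condvar. The sketch assumes the notifier updates the condition under the same lock before calling `notify_all`.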
> Do you think the complexity of dynamic upgrading is worthwhile, or
> should we just use the box-allocated PollCondVar unconditionally?
>
> I think if the current users won't benefit from the dynamic upgrading
> then we can avoid the complexity. We can always add it back later.
> Thoughts?

I do actually think it's worthwhile to consider:

I started an Android device running this. It created 3961 instances of
`UpgradePollCondVar` during the hour it ran, but only 5 were upgraded.

Alice