From: Onur Özkan
To: Gary Guo
Cc: rcu@vger.kernel.org, rust-for-linux@vger.kernel.org, linux-kernel@vger.kernel.org, ojeda@kernel.org, boqun@kernel.org, bjorn3_gh@protonmail.com, lossin@kernel.org, a.hindborg@kernel.org, aliceryhl@google.com, tmgross@umich.edu, dakr@kernel.org, peterz@infradead.org, fujita.tomonori@gmail.com, tamird@kernel.org, jiangshanlai@gmail.com, paulmck@kernel.org, josh@joshtriplett.org, rostedt@goodmis.org, mathieu.desnoyers@efficios.com
Subject: Re: [PATCH v2 2/3] rust: sync: add SRCU abstraction
Date: Mon, 11 May 2026 20:11:26 +0300
Message-ID: <20260511171154.154280-1-work@onurozkan.dev>
References: <20260502162833.34334-1-work@onurozkan.dev> <20260502162833.34334-3-work@onurozkan.dev> <20260503034008.36917-1-work@onurozkan.dev>

On Sun, 03 May 2026 20:25:03 +0100
Gary Guo wrote:

> On Sun May 3, 2026 at 4:39 AM BST, Onur Özkan wrote:
> >> > +/// Sleepable read-copy update primitive.
> >> > +///
> >> > +/// SRCU readers may sleep while holding the read-side guard.
> >> > +///
> >> > +/// The destructor may sleep.
> >> > +///
> >> > +/// # Invariants
> >> > +///
> >> > +/// This represents a valid `struct srcu_struct` initialized by the C SRCU API
> >> > +/// and it remains pinned and
valid until the pinned destructor runs.
> >> > +#[repr(transparent)]
> >> > +#[pin_data(PinnedDrop)]
> >> > +pub struct Srcu {
> >> > +    #[pin]
> >> > +    inner: Opaque<bindings::srcu_struct>,
> >> > +}
> >> > +
> >> > +impl Srcu {
> >> > +    /// Creates a new SRCU instance.
> >> > +    #[inline]
> >> > +    pub fn new(name: &'static CStr, key: Pin<&'static LockClassKey>) -> impl PinInit<Self, Error> {
> >> > +        try_pin_init!(Self {
> >> > +            inner <- Opaque::try_ffi_init(|ptr: *mut bindings::srcu_struct| {
> >> > +                // SAFETY: `ptr` points to valid uninitialised memory for a `srcu_struct`.
> >> > +                to_result(unsafe {
> >> > +                    bindings::init_srcu_struct_with_key(ptr, name.as_char_ptr(), key.as_ptr())
> >> > +                })
> >> > +            }),
> >> > +        })
> >> > +    }
> >> > +
> >> > +    /// Enters an SRCU read-side critical section.
> >> > +    ///
> >> > +    /// # Safety
> >> > +    ///
> >> > +    /// The returned [`Guard`] must not be leaked. Leaking it with [`core::mem::forget`]
> >> > +    /// leaves the SRCU read-side critical section active.
> >>
> >> I generally would prefer if we could use a guard-like API instead of forcing a
> >> callback.
> >
> > Me too, and developers can still do that. I think the safety contract here is
> > very simple to handle. It's essentially this:
> >
> >     // SAFETY: Guard is not leaked.
> >     let _guard = unsafe { x.read_lock() };
> >
> > To me it's very simple and straightforward for both the developer and the
> > reviewer.
> > It doesn't add any overhead to the implementation, and it ensures
> > that the developer (and later the reviewer) is aware of the potential issue.
> >
> > Of course, there's also the safe option if the developer is happy with the
> > closure-based API:
> >
> >     x.with_read_lock(|_guard| {
> >         ...
> >     });
> >
> > So it allows you to use the guard-based approach directly with the requirement
> > of a safety comment, and it also provides a safe API for developers who don't want
> > to deal with that. I am not sure if you fall into the third category, which is
> > "I don't like writing safety comments and I don't like the closure-based
> > approach" :)
>
> We have been avoiding callback-based APIs whenever there is an alternative way
> to achieve the same thing. There are quite a few precedents for this:
>
> - spin_lock_irqsave requires taking and releasing in the correct order, which is
>   easy to solve with a callback approach. The same reasoning could be used
>   to provide an unsafe API + safe callback API, but Boqun & Lyude reworked the
>   spinlock IRQ design so we don't need that anymore.
>
> - The `Task::current` API could easily have been replaced with a callback-based approach, but we
>   used a macro to achieve it without unsafe.
>
> The API here is not inherently impossible to expose as a guard API. It's not safe today
> because of a very specific detail. The callback API is the "path of least
> resistance" approach, but it's not the optimal one.
>
> >
> >>
> >> I suppose the only reason that this is unsafe is the "just leak it" condition
> >> when cleaning up the SRCU struct, which skips cleaning up delayed work, which could
> >> call into `process_srcu`, which accesses `srcu_struct`. This however is *not*
> >> leaked, because it's controlled by the user. Only the auxiliary data allocated
> >> by SRCU is leaked.
> >> So UAF is going to happen.
> >>
> >> So in some respects, the leak caused by `srcu_readers_active(ssp)` can cause more
> >> damage compared to just continuing cleanup despite active users? I think this
> >> could be changed in one of these ways:
> >> * Have SRCU allocate all memory instead, and the user side would just have a
> >>   `struct srcu_struct*`; then leaking would be safe. This is probably a bit
> >>   difficult to change because it affects many users.
> >
> > We could do the same on the Rust side only. Basically, instead of embedding
> > srcu_struct in the Rust Srcu, allocate it separately and store its pointer. Then,
> > if cleanup hits the active-reader case, we could leak that allocation so any
> > remaining SRCU work does not hit a UAF. I was aware of this option, but I would
> > prefer to avoid it because it adds an extra allocation for every Rust Srcu.
> >
> >> * Continue to flush delayed work and stop the timer, and then leak before the
> >>   actual kfree happens.
> >
> > Can you say more? I didn't understand this particular solution.
>
> I was thinking that doing this _might_ be sufficient. I have to admit that I'm
> not familiar enough with the internal implementation of SRCU to assert that,
> though.
>
> diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
> index 0d01cd8c4b4a..5d75a4dbb6e5 100644
> --- a/kernel/rcu/srcutree.c
> +++ b/kernel/rcu/srcutree.c
> @@ -717,8 +717,6 @@ void cleanup_srcu_struct(struct srcu_struct *ssp)
>  		raw_spin_unlock_irq_rcu_node(ssp->srcu_sup);
>  	if (WARN_ON(!delay))
>  		return; /* Just leak it! */
> -	if (WARN_ON(srcu_readers_active(ssp)))
> -		return; /* Just leak it! */
>  	/* Wait for irq_work to finish first as it may queue a new work. */
>  	irq_work_sync(&sup->irq_work);
>  	flush_delayed_work(&sup->work);
>
> But after taking another look, I am not even sure if this is needed.
> A quick glance at the code suggests that __srcu_read_unlock doesn't do anything apart
> from adjusting the counter, and the SRCU grace period and thus the timers won't
> actually start unless there's a pending grace period, which won't start unless
> there's a call_srcu or synchronize_srcu. And we *know* that neither of them would
> happen, as the lifetime guarantees that nothing accesses the `Srcu` struct when
> `drop` starts, and inside drop we have already invoked `srcu_barrier()`.
>
> So I think, even if we hit the "Just leak it" scenario, we can still safely
> deallocate the backing storage of `srcu_struct` and nothing should break?
>
> >
> >> * Trigger a `BUG` when the leak condition is hit for Rust users.
> >
> > We would need an atomic counter to detect the leak, and I thought that would be too
> > much overhead for this abstraction: each lock and drop would involve an
> > atomic operation.
>
> You could just check whether srcu_sup is NULL after calling `cleanup_srcu_struct`.
>
> Best,
> Gary
>
> >
> >> * Declare the `WARN_ON` to be sufficient protection and say this can be
> >>   considered safe. Kinda similar to the strategy we take for the
> >>   sleep-inside-atomic-context issue.
> >
> > I think this is a rather weak precaution.
> >
>

Alright, here is the alternative approach I just figured out, which I think makes
the most sense compared to the ones we discussed in this series:

#[pinned_drop]
impl PinnedDrop for Srcu {
    fn drop(self: Pin<&mut Self>) {
        let ptr = self.inner.get();

        // `cleanup_srcu_struct()` may return early if readers are still active. Because `Srcu`
        // owns the embedded `srcu_struct`, returning from `drop` in that state could free memory
        // that is still referenced by the C side.
        //
        // Wait for all readers to complete first.
        // If any `Guard` was leaked, `synchronize_srcu()`
        // will sleep forever.
        //
        // SAFETY: By the type invariants, `self` contains a valid and pinned `struct srcu_struct`.
        unsafe { bindings::synchronize_srcu(ptr) };

        // ...

(the code snippet above is copied from [1])

As explained in the doc comments, we avoid the potential UAFs by sleeping
forever, which is considered non-unsafe, right? Alice said during today's
call that "sleeping is not good, but it is not unsafe". If that's the case,
I think this fixes the overall problem and we can make read_unlock safe again.

What do you think? I will hold the next version until we reach a common point.

[1]: https://github.com/onur-ozkan/linux/commit/28d9739f03

- Onur
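P.S. To make the read-side API shapes above concrete, here is a minimal userspace sketch in plain std Rust. `ToySrcu`, `ToyGuard`, and every other name here are made up for illustration and are NOT the kernel API; the sketch only models why a leaked RAII guard keeps the reader count nonzero, which is exactly the state that a drop-side `synchronize_srcu()` would then wait on forever.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// Toy userspace model (NOT the kernel SRCU abstraction): a reader
// counter stands in for SRCU's per-CPU bookkeeping.
struct ToySrcu {
    readers: AtomicUsize,
}

// RAII guard: entering bumps the counter, dropping decrements it.
// Leaking the guard (e.g. via core::mem::forget) leaves the count
// permanently elevated, which models the "reader never finishes" case.
struct ToyGuard<'a> {
    srcu: &'a ToySrcu,
}

impl ToySrcu {
    fn new() -> Self {
        ToySrcu { readers: AtomicUsize::new(0) }
    }

    // Guard-based API: caller holds the read-side section as long as
    // the guard is alive.
    fn read_lock(&self) -> ToyGuard<'_> {
        self.readers.fetch_add(1, Ordering::SeqCst);
        ToyGuard { srcu: self }
    }

    // Closure-based API: the guard cannot escape the closure, so it
    // cannot be leaked by ordinary safe code paths.
    fn with_read_lock<R>(&self, f: impl FnOnce() -> R) -> R {
        let _guard = self.read_lock();
        f()
    }

    fn active_readers(&self) -> usize {
        self.readers.load(Ordering::SeqCst)
    }
}

impl Drop for ToyGuard<'_> {
    fn drop(&mut self) {
        self.srcu.readers.fetch_sub(1, Ordering::SeqCst);
    }
}

fn main() {
    let srcu = ToySrcu::new();
    {
        let _guard = srcu.read_lock();
        assert_eq!(srcu.active_readers(), 1);
    } // guard dropped here: read-side section ends
    assert_eq!(srcu.active_readers(), 0);
    let v = srcu.with_read_lock(|| 42);
    assert_eq!(v, 42);
    println!("ok");
}
```

In the real abstraction the counter bookkeeping is done by SRCU itself, of course; the point of the sketch is only the shape of the two APIs and the leak hazard the safety comment guards against.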