public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Boris Brezillon <boris.brezillon@collabora.com>
To: "Onur Özkan" <work@onurozkan.dev>
Cc: Alice Ryhl <aliceryhl@google.com>,
	rust-for-linux@vger.kernel.org, linux-kernel@vger.kernel.org,
	ojeda@kernel.org, boqun@kernel.org, gary@garyguo.net,
	bjorn3_gh@protonmail.com, lossin@kernel.org,
	a.hindborg@kernel.org, tmgross@umich.edu, dakr@kernel.org,
	tamird@kernel.org, daniel.almeida@collabora.com
Subject: Re: [PATCH v2 1/1] rust: add Work::disable_sync
Date: Tue, 5 May 2026 13:20:10 +0200	[thread overview]
Message-ID: <20260505132010.6acaf15a@fedora> (raw)
In-Reply-To: <20260505101126.92832-1-work@onurozkan.dev>

On Tue,  5 May 2026 13:10:32 +0300
Onur Özkan <work@onurozkan.dev> wrote:

> On Tue, 05 May 2026 11:50:14 +0200
> Boris Brezillon <boris.brezillon@collabora.com> wrote:
> 
> > On Tue,  5 May 2026 12:16:12 +0300
> > Onur Özkan <work@onurozkan.dev> wrote:
> >   
> > > On Tue, 05 May 2026 08:47:45 +0000
> > > Alice Ryhl <aliceryhl@google.com> wrote:
> > >   
> > > > On Tue, May 05, 2026 at 09:07:19AM +0300, Onur Özkan wrote:    
> > > > > On Mon, 04 May 2026 07:54:54 +0000
> > > > > Alice Ryhl <aliceryhl@google.com> wrote:
> > > > >     
> > > > > > On Fri, May 01, 2026 at 10:11:22PM +0300, Onur Özkan wrote:    
> > > > > > > Adds Work::disable_sync() as a safe wrapper for disable_work_sync().
> > > > > > > 
> > > > > > > Drivers can use this during teardown to stop new queueing and wait for
> > > > > > > queued or running work to finish before dropping related resources.
> > > > > > > 
> > > > > > > Signed-off-by: Onur Özkan <work@onurozkan.dev>
> > > > > > > ---
> > > > > > >  rust/kernel/workqueue.rs | 121 ++++++++++++++++++++++++++-------------
> > > > > > >  1 file changed, 81 insertions(+), 40 deletions(-)
> > > > > > > 
> > > > > > > diff --git a/rust/kernel/workqueue.rs b/rust/kernel/workqueue.rs
> > > > > > > index 7e253b6f299c..d0f9b4ba7f27 100644
> > > > > > > --- a/rust/kernel/workqueue.rs
> > > > > > > +++ b/rust/kernel/workqueue.rs
> > > > > > > @@ -267,7 +267,7 @@ pub unsafe fn from_raw<'a>(ptr: *const bindings::workqueue_struct) -> &'a Queue
> > > > > > >  
> > > > > > >      /// Enqueues a work item.
> > > > > > >      ///
> > > > > > > -    /// This may fail if the work item is already enqueued in a workqueue.
> > > > > > > +    /// This may fail if the work item is already enqueued in a workqueue or disabled.
> > > > > > >      ///
> > > > > > >      /// The work item will be submitted using `WORK_CPU_UNBOUND`.
> > > > > > >      pub fn enqueue<W, const ID: u64>(&self, w: W) -> W::EnqueueOutput    
> > > > > > 
> > > > > > Can you elaborate on the case where disable leads to failure here? Can
> > > > > > you not enqueue a work item again after disabling it? Is there a doc
> > > > > > test illustrating this case that I can run for myself to see the
> > > > > > behavior in action?    
> > > > > 
> > > > > As we discussed on yesterday's call, I looked into cancel_work_sync and
> > > > > it seems we can make this work in the tyr reset implementation. We already
> > > > > store an atomic reset state, it can be used to prevent future enqueues in
> > > > > reset scheduling.
> > > > > 
> > > > > I will send a patch for the cancel_sync function soon.    
> > > > 
> > > > I did hear from Boris that Panthor actually does make use of the disable
> > > > feature.     
> > > 
> > > I don't have any idea what Boris said, perhaps he should write it here as well.
> > > Maybe Boris said something that isn't covered on the tyr reset yet in my series,
> > > I don't know.  
> > 
> > So, I checked where those disable_[delayed_]work[_sync]() are, and they
> > seem to cover mostly the unplug path. There's a couple non-unplug
> > related use, which both cover the per-queue watchdog[3][4].
> > 
> > As for why we ended up using disable_work instead of cancel_work, it's
> > all explained in [1] and [2]. Yes, it could have been done differently,
> > but disable_work was the most convenient way of solving these UAFs in C.
> >   
> > > 
> > > What I do know is that I can achieve essentially the same behavior using
> > > cancel_work_sync instead of disable_work_sync on tyr. Like i said there's an
> > > atomic reset state we can use to prevent future work from being enqueued.  
> > 
> > If the atomic is already there to prevent queuing more works, or
> > checking if a reset is pending, sure. It might just be more custom
> > checks to add, and if we think we'll need to support work items that can
> > be disabled further down the line anyway, maybe it makes sense to work
> > on it now, dunno.  
> 
> With the disable_work_sync path, there are the 2 things I don't really like:
> 
> 	1. Having multiple sources of control for the reset path: one being the
> 	   atomic state and the other the workqueue. It feels odd to have both at
> 	   the same time.

The source of truth would still be the reset_pending atomic. The
question is more, how do you ensure the work items that are covered by
this reset_pending state can't be scheduled until this reset_pending is
restored to false? In panthor we do that with the
sched_queue_[delayed_]work() helpers, which works okay as long as you
ensure all sched-related works go through this helper. Using
disable_work() would provide a more robust solution. In rust it's
likely that you have other ways to enforce that.

> 	2. The added complexity on workqueue.rs
> 
> Regarding 1, IMO only one mechanism should own this logic. If we have to choose,
> we can't drop the atomic state since it's also useful in other parts. It's also
> more explicit and easier to understand (devs wouldn't need to know the workqueue
> internals to understand the scheduling behavior).

I agree, but it still relies on the fact you have a
thread-safe/non-blocking way to enforce that no one will ever be able to
reschedule a work item between the moment you call cancel_work_sync()
on the various work items you want to stop before doing the reset, and
the moment you're done with the reset. {disable,enable}_work() provides
a generic+race-free solution for that, which is why we used it in the
unplug path in panthor.

> 
> Regarding 2, Alice's suggestion at [1] seems quite complicated (even without the
> enable part) compare to how simple cancel_work_sync would be done.

Doesn't seem utterly complicated to me, but I'm probably not the best
person to comment on rust code. BTW, how is cancel_work_sync()
different in term of what it means for Box containers? You still can't
cancel a work item if it's owned by the workqueue, can you? So all this
really changes is the fact you have to re-enable a disabled work item
before enqueuing it again if you want it to do anything.

BTW, in panthor, we also heavily rely on queue_work() not doing anything
and returning false if the work item is already scheduled, so accepting
::enqueue() failures in Tyr is going to be standard behavior, I think.

> 
> I think I will go with the cancel_work_sync + atomic state approach. At least until
> we run into a case where it doesn't work (which I cannot guess at the moment) and
> disable_work_sync becomes necessary.

Sounds good.

  reply	other threads:[~2026-05-05 11:20 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-05-01 19:11 [PATCH v2 0/1] rust: add Work::disable_sync Onur Özkan
2026-05-01 19:11 ` [PATCH v2 1/1] " Onur Özkan
2026-05-04  7:54   ` Alice Ryhl
2026-05-05  6:07     ` Onur Özkan
2026-05-05  8:47       ` Alice Ryhl
2026-05-05  9:16         ` Onur Özkan
2026-05-05  9:50           ` Boris Brezillon
2026-05-05 10:10             ` Onur Özkan
2026-05-05 11:20               ` Boris Brezillon [this message]
2026-05-04  7:55 ` [PATCH v2 0/1] " Alice Ryhl

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260505132010.6acaf15a@fedora \
    --to=boris.brezillon@collabora.com \
    --cc=a.hindborg@kernel.org \
    --cc=aliceryhl@google.com \
    --cc=bjorn3_gh@protonmail.com \
    --cc=boqun@kernel.org \
    --cc=dakr@kernel.org \
    --cc=daniel.almeida@collabora.com \
    --cc=gary@garyguo.net \
    --cc=linux-kernel@vger.kernel.org \
    --cc=lossin@kernel.org \
    --cc=ojeda@kernel.org \
    --cc=rust-for-linux@vger.kernel.org \
    --cc=tamird@kernel.org \
    --cc=tmgross@umich.edu \
    --cc=work@onurozkan.dev \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox