Re: [PATCH v2 1/1] rust: add Work::disable_sync

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Boris Brezillon <boris.brezillon@collabora.com>
To: "Onur Özkan" <work@onurozkan.dev>
Cc: Alice Ryhl <aliceryhl@google.com>,
	rust-for-linux@vger.kernel.org, linux-kernel@vger.kernel.org,
	ojeda@kernel.org, boqun@kernel.org, gary@garyguo.net,
	bjorn3_gh@protonmail.com, lossin@kernel.org,
	a.hindborg@kernel.org, tmgross@umich.edu, dakr@kernel.org,
	tamird@kernel.org, daniel.almeida@collabora.com
Subject: Re: [PATCH v2 1/1] rust: add Work::disable_sync
Date: Tue, 5 May 2026 13:20:10 +0200	[thread overview]
Message-ID: <20260505132010.6acaf15a@fedora> (raw)
In-Reply-To: <20260505101126.92832-1-work@onurozkan.dev>

On Tue,  5 May 2026 13:10:32 +0300
Onur Özkan <work@onurozkan.dev> wrote:

> On Tue, 05 May 2026 11:50:14 +0200
> Boris Brezillon <boris.brezillon@collabora.com> wrote:
> 
> > On Tue,  5 May 2026 12:16:12 +0300
> > Onur Özkan <work@onurozkan.dev> wrote:
> >   
> > > On Tue, 05 May 2026 08:47:45 +0000
> > > Alice Ryhl <aliceryhl@google.com> wrote:
> > >   
> > > > On Tue, May 05, 2026 at 09:07:19AM +0300, Onur Özkan wrote:    
> > > > > On Mon, 04 May 2026 07:54:54 +0000
> > > > > Alice Ryhl <aliceryhl@google.com> wrote:
> > > > >     
> > > > > > On Fri, May 01, 2026 at 10:11:22PM +0300, Onur Özkan wrote:    
> > > > > > > Adds Work::disable_sync() as a safe wrapper for disable_work_sync().
> > > > > > > 
> > > > > > > Drivers can use this during teardown to stop new queueing and wait for
> > > > > > > queued or running work to finish before dropping related resources.
> > > > > > > 
> > > > > > > Signed-off-by: Onur Özkan <work@onurozkan.dev>
> > > > > > > ---
> > > > > > >  rust/kernel/workqueue.rs | 121 ++++++++++++++++++++++++++-------------
> > > > > > >  1 file changed, 81 insertions(+), 40 deletions(-)
> > > > > > > 
> > > > > > > diff --git a/rust/kernel/workqueue.rs b/rust/kernel/workqueue.rs
> > > > > > > index 7e253b6f299c..d0f9b4ba7f27 100644
> > > > > > > --- a/rust/kernel/workqueue.rs
> > > > > > > +++ b/rust/kernel/workqueue.rs
> > > > > > > @@ -267,7 +267,7 @@ pub unsafe fn from_raw<'a>(ptr: *const bindings::workqueue_struct) -> &'a Queue
> > > > > > >  
> > > > > > >      /// Enqueues a work item.
> > > > > > >      ///
> > > > > > > -    /// This may fail if the work item is already enqueued in a workqueue.
> > > > > > > +    /// This may fail if the work item is already enqueued in a workqueue or disabled.
> > > > > > >      ///
> > > > > > >      /// The work item will be submitted using `WORK_CPU_UNBOUND`.
> > > > > > >      pub fn enqueue<W, const ID: u64>(&self, w: W) -> W::EnqueueOutput    
> > > > > > 
> > > > > > Can you elaborate on the case where disable leads to failure here? Can
> > > > > > you not enqueue a work item again after disabling it? Is there a doc
> > > > > > test illustrating this case that I can run for myself to see the
> > > > > > behavior in action?    
> > > > > 
> > > > > As we discussed on yesterday's call, I looked into cancel_work_sync and
> > > > > it seems we can make this work in the tyr reset implementation. We already
> > > > > store an atomic reset state, it can be used to prevent future enqueues in
> > > > > reset scheduling.
> > > > > 
> > > > > I will send a patch for the cancel_sync function soon.    
> > > > 
> > > > I did hear from Boris that Panthor actually does make use of the disable
> > > > feature.     
> > > 
> > > I don't have any idea what Boris said, perhaps he should write it here as well.
> > > Maybe Boris said something that isn't covered on the tyr reset yet in my series,
> > > I don't know.  
> > 
> > So, I checked where those disable_[delayed_]work[_sync]() are, and they
> > seem to cover mostly the unplug path. There's a couple non-unplug
> > related use, which both cover the per-queue watchdog[3][4].
> > 
> > As for why we ended up using disable_work instead of cancel_work, it's
> > all explained in [1] and [2]. Yes, it could have been done differently,
> > but disable_work was the most convenient way of solving these UAFs in C.
> >   
> > > 
> > > What I do know is that I can achieve essentially the same behavior using
> > > cancel_work_sync instead of disable_work_sync on tyr. Like i said there's an
> > > atomic reset state we can use to prevent future work from being enqueued.  
> > 
> > If the atomic is already there to prevent queuing more works, or
> > checking if a reset is pending, sure. It might just be more custom
> > checks to add, and if we think we'll need to support work items that can
> > be disabled further down the line anyway, maybe it makes sense to work
> > on it now, dunno.  
> 
> With the disable_work_sync path, there are the 2 things I don't really like:
> 
> 	1. Having multiple sources of control for the reset path: one being the
> 	   atomic state and the other the workqueue. It feels odd to have both at
> 	   the same time.

The source of truth would still be the reset_pending atomic. The
question is more, how do you ensure the work items that are covered by
this reset_pending state can't be scheduled until this reset_pending is
restored to false? In panthor we do that with the
sched_queue_[delayed_]work() helpers, which works okay as long as you
ensure all sched-related works go through this helper. Using
disable_work() would provide a more robust solution. In rust it's
likely that you have other ways to enforce that.

> 	2. The added complexity on workqueue.rs
> 
> Regarding 1, IMO only one mechanism should own this logic. If we have to choose,
> we can't drop the atomic state since it's also useful in other parts. It's also
> more explicit and easier to understand (devs wouldn't need to know the workqueue
> internals to understand the scheduling behavior).

I agree, but it still relies on the fact you have a
thread-safe/non-blocking way to enforce that no one will ever be able to
reschedule a work item between the moment you call cancel_work_sync()
on the various work items you want to stop before doing the reset, and
the moment you're done with the reset. {disable,enable}_work() provides
a generic+race-free solution for that, which is why we used it in the
unplug path in panthor.

> 
> Regarding 2, Alice's suggestion at [1] seems quite complicated (even without the
> enable part) compare to how simple cancel_work_sync would be done.

Doesn't seem utterly complicated to me, but I'm probably not the best
person to comment on rust code. BTW, how is cancel_work_sync()
different in term of what it means for Box containers? You still can't
cancel a work item if it's owned by the workqueue, can you? So all this
really changes is the fact you have to re-enable a disabled work item
before enqueuing it again if you want it to do anything.

BTW, in panthor, we also heavily rely on queue_work() not doing anything
and returning false if the work item is already scheduled, so accepting
::enqueue() failures in Tyr is going to be standard behavior, I think.

> 
> I think I will go with the cancel_work_sync + atomic state approach. At least until
> we run into a case where it doesn't work (which I cannot guess at the moment) and
> disable_work_sync becomes necessary.

Sounds good.

next prev parent reply	other threads:[~2026-05-05 11:20 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-05-01 19:11 [PATCH v2 0/1] rust: add Work::disable_sync Onur Özkan
2026-05-01 19:11 ` [PATCH v2 1/1] " Onur Özkan
2026-05-04  7:54   ` Alice Ryhl
2026-05-05  6:07     ` Onur Özkan
2026-05-05  8:47       ` Alice Ryhl
2026-05-05  9:16         ` Onur Özkan
2026-05-05  9:50           ` Boris Brezillon
2026-05-05 10:10             ` Onur Özkan
2026-05-05 11:20               ` Boris Brezillon [this message]
2026-05-06  7:05                 ` Onur Özkan
2026-05-04  7:55 ` [PATCH v2 0/1] " Alice Ryhl

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260505132010.6acaf15a@fedora \
    --to=boris.brezillon@collabora.com \
    --cc=a.hindborg@kernel.org \
    --cc=aliceryhl@google.com \
    --cc=bjorn3_gh@protonmail.com \
    --cc=boqun@kernel.org \
    --cc=dakr@kernel.org \
    --cc=daniel.almeida@collabora.com \
    --cc=gary@garyguo.net \
    --cc=linux-kernel@vger.kernel.org \
    --cc=lossin@kernel.org \
    --cc=ojeda@kernel.org \
    --cc=rust-for-linux@vger.kernel.org \
    --cc=tamird@kernel.org \
    --cc=tmgross@umich.edu \
    --cc=work@onurozkan.dev \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.