The Linux Kernel Mailing List
* [PATCH] workqueue: release PENDING in __queue_work() drain/destroy reject path
@ 2026-05-07 11:04 Breno Leitao
  2026-05-08 18:04 ` Tejun Heo
  0 siblings, 1 reply; 2+ messages in thread
From: Breno Leitao @ 2026-05-07 11:04 UTC
  To: Tejun Heo, Lai Jiangshan; +Cc: linux-kernel, clm, kernel-team, Breno Leitao

The caller of __queue_work() owns WORK_STRUCT_PENDING, won via
test_and_set_bit() in queue_work_on()/__queue_delayed_work(). The
state machine documented above __queue_work() requires that owner
to either hand the token to a pwq (insert_work() -> set_work_pwq()),
hand it to a timer, or release it via set_work_pool_and_clear_pending().
try_to_grab_pending() relies on this: when it observes
"PENDING && off-queue" it busy-loops, trusting the current owner to
make progress.

The (__WQ_DESTROYING | __WQ_DRAINING) early-return path violates that
contract. It WARN_ONCE()s and bare-returns, leaving work->data with
PENDING set, WORK_STRUCT_PWQ clear, and work->entry empty.

The path is reachable without explicit API abuse: queue_delayed_work()
arms a timer with PENDING set. If drain_workqueue() runs while the
timer is still pending, delayed_work_timer_fn() -> __queue_work() runs
in softirq context. current is not a wq worker, so is_chained_work()
is false, the WARN fires, and the work is dropped with PENDING
leaked.

Mirror what clear_pending_if_disabled() already does on its analogous
reject path: unpack the off-queue data and call
set_work_pool_and_clear_pending() to release the token before
returning.

I was able to reproduce this by queueing several slow works on
a max_active=1 wq, arming a delayed_work whose timer fires while
drain_workqueue() is blocked, and then calling
cancel_delayed_work_sync(). Without this patch the cancel livelocks
at 100% CPU; with it the cancel returns immediately.

Signed-off-by: Breno Leitao <leitao@debian.org>
---
Not sure whether you want a Fixes tag; if you do, I would point it to:
Fixes: e41e704bc4f4 ("workqueue: improve destroy_workqueue() debuggability")
---
 kernel/workqueue.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 2506b5cfbb133..885be263b2825 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -2296,6 +2296,18 @@ static void __queue_work(int cpu, struct workqueue_struct *wq,
 	if (unlikely(wq->flags & (__WQ_DESTROYING | __WQ_DRAINING) &&
 		     WARN_ONCE(!is_chained_work(wq), "workqueue: cannot queue %ps on wq %s\n",
 			       work->func, wq->name))) {
+		struct work_offq_data offqd;
+
+		/*
+		 * State on entry: PENDING is set, work is off-queue (no
+		 * insert_work() has run).
+		 *
+		 * Returning without clearing PENDING would leave the work
+		 * in a weird state (PENDING=1, PWQ=0, entry empty)
+		 */
+		work_offqd_unpack(&offqd, *work_data_bits(work));
+		set_work_pool_and_clear_pending(work, offqd.pool_id,
+						work_offqd_pack_flags(&offqd));
 		return;
 	}
 	rcu_read_lock();

---
base-commit: 735d2f48cadaa9a87e7c7601667878de70c771c5
change-id: 20260507-workqueue_pending-b91beb94ef46

Best regards,
--  
Breno Leitao <leitao@debian.org>



* Re: [PATCH] workqueue: release PENDING in __queue_work() drain/destroy reject path
  2026-05-07 11:04 [PATCH] workqueue: release PENDING in __queue_work() drain/destroy reject path Breno Leitao
@ 2026-05-08 18:04 ` Tejun Heo
  0 siblings, 0 replies; 2+ messages in thread
From: Tejun Heo @ 2026-05-08 18:04 UTC
  To: Breno Leitao, Lai Jiangshan; +Cc: linux-kernel, clm, kernel-team

Hello,

On Thu, 07 May 2026 04:04:46 -0700, Breno Leitao wrote:
> The caller of __queue_work() owns WORK_STRUCT_PENDING, won via
> test_and_set_bit() in queue_work_on()/__queue_delayed_work(). The
> state machine documented above __queue_work() requires that owner
> to either hand the token to a pwq (insert_work() -> set_work_pwq()),
> hand it to a timer, or release it via set_work_pool_and_clear_pending().
> try_to_grab_pending() relies on this: when it observes
> "PENDING && off-queue" it busy-loops, trusting the current owner to
> make progress.
>
> [...]

Applied to wq/for-7.1-fixes (capitalized the first word of the
subject).

Thanks.

--
tejun
