* [PATCH] workqueue: release PENDING in __queue_work() drain/destroy reject path
@ 2026-05-07 11:04 Breno Leitao
2026-05-08 18:04 ` Tejun Heo
From: Breno Leitao @ 2026-05-07 11:04 UTC (permalink / raw)
To: Tejun Heo, Lai Jiangshan; +Cc: linux-kernel, clm, kernel-team, Breno Leitao
The caller of __queue_work() owns WORK_STRUCT_PENDING, won via
test_and_set_bit() in queue_work_on()/__queue_delayed_work(). The
state machine documented above __queue_work() requires that owner
to either hand the token to a pwq (insert_work() -> set_work_pwq()),
hand it to a timer, or release it via set_work_pool_and_clear_pending().
try_to_grab_pending() relies on this: when it observes
"PENDING && off-queue" it busy-loops, trusting the current owner to
make progress.
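To make the contract concrete, here is a minimal user-space sketch of
the token hand-off using C11 atomics. It is a model only, not kernel
code: the model_* names, the two-bit layout and the enum are
assumptions made for illustration, and the real try_to_grab_pending()
also deals with timers, pool locking and disable counts that are left
out here.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical bit layout for this model only; not the kernel's. */
#define MODEL_PENDING (1UL << 0)	/* somebody owns the work item */
#define MODEL_QUEUED  (1UL << 1)	/* token was handed to a queue */

struct model_work {
	atomic_ulong data;
};

/* Try to become the owner; true only if PENDING was previously clear. */
static bool model_claim(struct model_work *w)
{
	return !(atomic_fetch_or(&w->data, MODEL_PENDING) & MODEL_PENDING);
}

/* Owner option 1: hand the token to a queue. */
static void model_hand_to_queue(struct model_work *w)
{
	assert(atomic_load(&w->data) & MODEL_PENDING);
	atomic_fetch_or(&w->data, MODEL_QUEUED);
}

/* Owner option 2: release the token, leaving the work idle. */
static void model_release(struct model_work *w)
{
	assert(atomic_load(&w->data) & MODEL_PENDING);
	atomic_fetch_and(&w->data, ~(MODEL_PENDING | MODEL_QUEUED));
}

enum model_grab { MODEL_GRABBED_IDLE, MODEL_GRABBED_QUEUED, MODEL_RETRY };

/* What a canceller observes when it tries to take the token. */
static enum model_grab model_try_grab(struct model_work *w)
{
	unsigned long old = atomic_fetch_or(&w->data, MODEL_PENDING);

	if (!(old & MODEL_PENDING))
		return MODEL_GRABBED_IDLE;	/* work was idle, we own it now */
	if (old & MODEL_QUEUED) {
		/* Steal it from the queue; PENDING stays set for us. */
		atomic_fetch_and(&w->data, ~MODEL_QUEUED);
		return MODEL_GRABBED_QUEUED;
	}
	/* PENDING && off-queue: the owner is expected to make progress. */
	return MODEL_RETRY;
}
```

In this model, a canceller that keeps seeing MODEL_RETRY is only safe
to spin if the owner eventually calls model_hand_to_queue() or
model_release(); an owner that returns without doing either leaves
every later model_try_grab() stuck on MODEL_RETRY.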
The (__WQ_DESTROYING | __WQ_DRAINING) early-return path violates that
contract. It WARN_ONCE()s and bare-returns, leaving work->data with
PENDING set, WORK_STRUCT_PWQ clear, and work->entry empty.
The path is reachable without explicit API abuse: queue_delayed_work()
arms a timer with PENDING set; if drain_workqueue() runs while the
timer is still pending, delayed_work_timer_fn() -> __queue_work() runs
in softirq context, where current is not a wq worker and
is_chained_work() is therefore false. The WARN fires and the work is
dropped with PENDING leaked.
Mirror what clear_pending_if_disabled() already does on its analogous
reject path: unpack the off-queue data and call
set_work_pool_and_clear_pending() to release the token before
returning.
I was able to reproduce this by queueing several slow work items on
a max_active=1 wq, arming a delayed_work whose timer fires while
drain_workqueue() is blocked, and then calling cancel_delayed_work_sync().
Without this patch the cancel livelocks at 100% CPU; with it the cancel
returns immediately.
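The before/after behaviour can be approximated in user space with the
sketch below. It is a hedged model rather than the kernel reproducer:
the model_* names, the single PENDING bit, the treatment of every other
bit as opaque off-queue data, and the bounded retry count standing in
for the cancel path's unbounded loop are all assumptions made for
illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical layout: bit 0 is PENDING, all other bits are off-queue data. */
#define MODEL_PENDING	(1UL << 0)
#define MODEL_OFFQ_MASK	(~MODEL_PENDING)

struct model_work {
	atomic_ulong data;
};

/*
 * Models a queue attempt that is rejected because the queue is draining.
 * With release_token == false it bare-returns and leaks the token;
 * with release_token == true it clears PENDING and keeps the other bits.
 */
static void model_queue_on_draining_wq(struct model_work *w, bool release_token)
{
	unsigned long bits = atomic_load(&w->data);

	assert(bits & MODEL_PENDING);	/* the caller owns the token */
	if (release_token)
		atomic_store(&w->data, bits & MODEL_OFFQ_MASK);
}

/*
 * Bounded stand-in for a canceller's retry loop. Returns true if the
 * token could be grabbed (and then released) within max_tries attempts.
 */
static bool model_cancel(struct model_work *w, int max_tries)
{
	for (int i = 0; i < max_tries; i++) {
		unsigned long old = atomic_fetch_or(&w->data, MODEL_PENDING);

		if (!(old & MODEL_PENDING)) {
			/* Grabbed; the cancel is done, so release again. */
			atomic_fetch_and(&w->data, ~MODEL_PENDING);
			return true;
		}
		/* PENDING && off-queue: nothing to steal, try again. */
	}
	return false;
}
```

Calling model_queue_on_draining_wq(&w, false) and then
model_cancel(&w, 1000) exhausts all retries, which corresponds to the
livelock; calling it with true lets the first retry succeed while the
non-PENDING bits survive unchanged.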
Signed-off-by: Breno Leitao <leitao@debian.org>
---
Not sure whether you want a Fixes tag, but if you do, I would point it
to:
Fixes: e41e704bc4f4 ("workqueue: improve destroy_workqueue() debuggability")
---
kernel/workqueue.c | 12 ++++++++++++
1 file changed, 12 insertions(+)
diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 2506b5cfbb133..885be263b2825 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -2296,6 +2296,18 @@ static void __queue_work(int cpu, struct workqueue_struct *wq,
 	if (unlikely(wq->flags & (__WQ_DESTROYING | __WQ_DRAINING) &&
 		     WARN_ONCE(!is_chained_work(wq), "workqueue: cannot queue %ps on wq %s\n",
 			       work->func, wq->name))) {
+		struct work_offq_data offqd;
+
+		/*
+		 * State on entry: PENDING is set, work is off-queue (no
+		 * insert_work() has run).
+		 *
+		 * Returning without clearing PENDING would leave the work
+		 * in a weird state (PENDING=1, PWQ=0, entry empty)
+		 */
+		work_offqd_unpack(&offqd, *work_data_bits(work));
+		set_work_pool_and_clear_pending(work, offqd.pool_id,
+						work_offqd_pack_flags(&offqd));
 		return;
 	}
 	rcu_read_lock();
---
base-commit: 735d2f48cadaa9a87e7c7601667878de70c771c5
change-id: 20260507-workqueue_pending-b91beb94ef46
Best regards,
--
Breno Leitao <leitao@debian.org>
* Re: [PATCH] workqueue: release PENDING in __queue_work() drain/destroy reject path
2026-05-07 11:04 [PATCH] workqueue: release PENDING in __queue_work() drain/destroy reject path Breno Leitao
@ 2026-05-08 18:04 ` Tejun Heo
From: Tejun Heo @ 2026-05-08 18:04 UTC (permalink / raw)
To: Breno Leitao, Lai Jiangshan; +Cc: linux-kernel, clm, kernel-team
Hello,
On Thu, 07 May 2026 04:04:46 -0700, Breno Leitao wrote:
> The caller of __queue_work() owns WORK_STRUCT_PENDING, won via
> test_and_set_bit() in queue_work_on()/__queue_delayed_work(). The
> state machine documented above __queue_work() requires that owner
> to either hand the token to a pwq (insert_work() -> set_work_pwq()),
> hand it to a timer, or release it via set_work_pool_and_clear_pending().
> try_to_grab_pending() relies on this: when it observes
> "PENDING && off-queue" it busy-loops, trusting the current owner to
> make progress.
>
> [...]
Applied to wq/for-7.1-fixes (capitalized the first word of the
subject).
Thanks.
--
tejun