From: Tejun Heo <tj@kernel.org>
To: Chandan Babu R <chandanbabu@kernel.org>
Cc: jiangshanlai@gmail.com, linux-kernel@vger.kernel.org,
linux-xfs@vger.kernel.org
Subject: [PATCH wq/for-6.9] workqueue: Fix pwq->nr_in_flight corruption in try_to_grab_pending()
Date: Sun, 4 Feb 2024 11:14:21 -1000
Message-ID: <Zb_-LQLY7eRuakfe@slm.duckdns.org>
In-Reply-To: <87o7cxeehy.fsf@debian-BULLSEYE-live-builder-AMD64>
dd6c3c544126 ("workqueue: Move pwq_dec_nr_in_flight() to the end of work
item handling") relocated pwq_dec_nr_in_flight() after
set_work_pool_and_keep_pending(). However, the latter destroys information
contained in work->data that's needed by pwq_dec_nr_in_flight() including
the flush color. With flush color destroyed, flush_workqueue() can stall
easily when mixed with cancel_work*() usages.

This is easily triggered by running the xfstests generic/001 test on xfs:
INFO: task umount:6305 blocked for more than 122 seconds.
...
task:umount state:D stack:13008 pid:6305 tgid:6305 ppid:6301 flags:0x00004000
Call Trace:
<TASK>
__schedule+0x2f6/0xa20
schedule+0x36/0xb0
schedule_timeout+0x20b/0x280
wait_for_completion+0x8a/0x140
__flush_workqueue+0x11a/0x3b0
xfs_inodegc_flush+0x24/0xf0
xfs_unmountfs+0x14/0x180
xfs_fs_put_super+0x3d/0x90
generic_shutdown_super+0x7c/0x160
kill_block_super+0x1b/0x40
xfs_kill_sb+0x12/0x30
deactivate_locked_super+0x35/0x90
deactivate_super+0x42/0x50
cleanup_mnt+0x109/0x170
__cleanup_mnt+0x12/0x20
task_work_run+0x60/0x90
syscall_exit_to_user_mode+0x146/0x150
do_syscall_64+0x5d/0x110
entry_SYSCALL_64_after_hwframe+0x6c/0x74
Fix it by stashing work_data before calling set_work_pool_and_keep_pending()
and using the stashed value for pwq_dec_nr_in_flight().
Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Chandan Babu R <chandanbabu@kernel.org>
Link: http://lkml.kernel.org/r/87o7cxeehy.fsf@debian-BULLSEYE-live-builder-AMD64
Fixes: dd6c3c544126 ("workqueue: Move pwq_dec_nr_in_flight() to the end of work item handling")
---
Hello, Chandan.
Thanks a lot for the report. I could reproduce the problem and verified that
this patch fixes the issue. I'm applying this to wq/for-6.9 but would really
appreciate it if you could confirm the fix.
Thanks.
kernel/workqueue.c | 10 ++++++++--
1 file changed, 8 insertions(+), 2 deletions(-)
diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index ffb625db9771..55c9816506b0 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -1999,6 +1999,8 @@ static int try_to_grab_pending(struct work_struct *work, bool is_dwork,
*/
pwq = get_work_pwq(work);
if (pwq && pwq->pool == pool) {
+ unsigned long work_data;
+
debug_work_deactivate(work);
/*
@@ -2016,11 +2018,15 @@ static int try_to_grab_pending(struct work_struct *work, bool is_dwork,
list_del_init(&work->entry);
- /* work->data points to pwq iff queued, point to pool */
+ /*
+ * work->data points to pwq iff queued. Let's point to pool. As
+ * this destroys work->data needed by the next step, stash it.
+ */
+ work_data = *work_data_bits(work);
set_work_pool_and_keep_pending(work, pool->id);
/* must be the last step, see the function comment */
- pwq_dec_nr_in_flight(pwq, *work_data_bits(work));
+ pwq_dec_nr_in_flight(pwq, work_data);
raw_spin_unlock(&pool->lock);
rcu_read_unlock();
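
For readers who want to see the ordering problem in isolation, here is a minimal
userspace sketch of the stash-before-overwrite pattern the patch applies. It is
not kernel code: the sketch_* names, the bit layout (pending flag in bit 0, flush
color in bits 4-7, pool id from bit 8 up) and the 16-entry counter array are all
assumptions made up for illustration and do not reflect the real work->data
encoding.

```c
#include <assert.h>

/* Hypothetical layout for illustration only, NOT the kernel's encoding. */
#define SKETCH_PENDING      1UL
#define SKETCH_COLOR_SHIFT  4
#define SKETCH_COLOR_MASK   0xfUL
#define SKETCH_POOL_SHIFT   8

struct sketch_work { unsigned long data; };
struct sketch_pwq  { int nr_in_flight[16]; };

/* Build a data word for a queued, pending item with the given flush color. */
static unsigned long sketch_pack(unsigned int color)
{
	return ((unsigned long)color << SKETCH_COLOR_SHIFT) | SKETCH_PENDING;
}

static unsigned int sketch_color(unsigned long data)
{
	return (data >> SKETCH_COLOR_SHIFT) & SKETCH_COLOR_MASK;
}

/* Stand-in for "point to pool": overwrites the word, losing the color. */
static void sketch_point_to_pool(struct sketch_work *work, unsigned long pool_id)
{
	work->data = (pool_id << SKETCH_POOL_SHIFT) | SKETCH_PENDING;
}

/* Stand-in for the in-flight accounting: decrement the counter for the
 * color encoded in the data word that the caller passes in. */
static void sketch_dec_in_flight(struct sketch_pwq *pwq, unsigned long data)
{
	pwq->nr_in_flight[sketch_color(data)]--;
}

/* Buggy ordering: the data word is read after it has been overwritten. */
static void sketch_grab_buggy(struct sketch_pwq *pwq, struct sketch_work *work,
			      unsigned long pool_id)
{
	sketch_point_to_pool(work, pool_id);
	sketch_dec_in_flight(pwq, work->data);	/* color already gone */
}

/* Fixed ordering: stash the data word first, then use the stashed copy. */
static void sketch_grab_fixed(struct sketch_pwq *pwq, struct sketch_work *work,
			      unsigned long pool_id)
{
	unsigned long stashed = work->data;

	sketch_point_to_pool(work, pool_id);
	sketch_dec_in_flight(pwq, stashed);
}
```

In this sketch, the buggy variant decrements the counter for color 0 instead of
the item's real color, so the real color's count never reaches zero. That is
the same shape of failure as a flush waiting forever on a color that is never
drained.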