From: Bart Van Assche <bvanassche@acm.org>
To: Tejun Heo <tj@kernel.org>
Cc: linux-kernel@vger.kernel.org,
Johannes Berg <johannes.berg@intel.com>,
Christoph Hellwig <hch@lst.de>, Sagi Grimberg <sagi@grimberg.me>,
tytso@mit.edu, bvanassche@acm.org
Subject: [PATCH 3/3] kernel/workqueue: Suppress a false positive lockdep complaint
Date: Thu, 25 Oct 2018 08:05:40 -0700 [thread overview]
Message-ID: <20181025150540.259281-4-bvanassche@acm.org> (raw)
In-Reply-To: <20181025150540.259281-1-bvanassche@acm.org>
It can happen that the direct I/O queue creates and destroys an empty
workqueue from inside a work function. Avoid that this triggers the
false positive lockdep complaint shown below.
======================================================
WARNING: possible circular locking dependency detected
4.19.0-dbg+ #1 Not tainted
------------------------------------------------------
fio/4129 is trying to acquire lock:
00000000a01cfe1a ((wq_completion)"dio/%s"sb->s_id){+.+.}, at: flush_workqueue+0xd0/0x970
but task is already holding lock:
00000000a0acecf9 (&sb->s_type->i_mutex_key#14){+.+.}, at: ext4_file_write_iter+0x154/0x710
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #2 (&sb->s_type->i_mutex_key#14){+.+.}:
down_write+0x3d/0x80
__generic_file_fsync+0x77/0xf0
ext4_sync_file+0x3c9/0x780
vfs_fsync_range+0x66/0x100
dio_complete+0x2f5/0x360
dio_aio_complete_work+0x1c/0x20
process_one_work+0x481/0x9f0
worker_thread+0x63/0x5a0
kthread+0x1cf/0x1f0
ret_from_fork+0x24/0x30
-> #1 ((work_completion)(&dio->complete_work)){+.+.}:
process_one_work+0x447/0x9f0
worker_thread+0x63/0x5a0
kthread+0x1cf/0x1f0
ret_from_fork+0x24/0x30
-> #0 ((wq_completion)"dio/%s"sb->s_id){+.+.}:
lock_acquire+0xc5/0x200
flush_workqueue+0xf3/0x970
drain_workqueue+0xec/0x220
destroy_workqueue+0x23/0x350
sb_init_dio_done_wq+0x6a/0x80
do_blockdev_direct_IO+0x1f33/0x4be0
__blockdev_direct_IO+0x79/0x86
ext4_direct_IO+0x5df/0xbb0
generic_file_direct_write+0x119/0x220
__generic_file_write_iter+0x131/0x2d0
ext4_file_write_iter+0x3fa/0x710
aio_write+0x235/0x330
io_submit_one+0x510/0xeb0
__x64_sys_io_submit+0x122/0x340
do_syscall_64+0x71/0x220
entry_SYSCALL_64_after_hwframe+0x49/0xbe
other info that might help us debug this:
Chain exists of:
(wq_completion)"dio/%s"sb->s_id --> (work_completion)(&dio->complete_work) --> &sb->s_type->i_mutex_key#14
Possible unsafe locking scenario:
CPU0 CPU1
---- ----
lock(&sb->s_type->i_mutex_key#14);
lock((work_completion)(&dio->complete_work));
lock(&sb->s_type->i_mutex_key#14);
lock((wq_completion)"dio/%s"sb->s_id);
*** DEADLOCK ***
1 lock held by fio/4129:
#0: 00000000a0acecf9 (&sb->s_type->i_mutex_key#14){+.+.}, at: ext4_file_write_iter+0x154/0x710
stack backtrace:
CPU: 3 PID: 4129 Comm: fio Not tainted 4.19.0-dbg+ #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
Call Trace:
dump_stack+0x86/0xc5
print_circular_bug.isra.32+0x20a/0x218
__lock_acquire+0x1c68/0x1cf0
lock_acquire+0xc5/0x200
flush_workqueue+0xf3/0x970
drain_workqueue+0xec/0x220
destroy_workqueue+0x23/0x350
sb_init_dio_done_wq+0x6a/0x80
do_blockdev_direct_IO+0x1f33/0x4be0
__blockdev_direct_IO+0x79/0x86
ext4_direct_IO+0x5df/0xbb0
generic_file_direct_write+0x119/0x220
__generic_file_write_iter+0x131/0x2d0
ext4_file_write_iter+0x3fa/0x710
aio_write+0x235/0x330
io_submit_one+0x510/0xeb0
__x64_sys_io_submit+0x122/0x340
do_syscall_64+0x71/0x220
entry_SYSCALL_64_after_hwframe+0x49/0xbe
Cc: Johannes Berg <johannes.berg@intel.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
---
include/linux/workqueue.h | 1 +
kernel/workqueue.c | 6 +++++-
2 files changed, 6 insertions(+), 1 deletion(-)
diff --git a/include/linux/workqueue.h b/include/linux/workqueue.h
index 60d673e15632..375ec764f148 100644
--- a/include/linux/workqueue.h
+++ b/include/linux/workqueue.h
@@ -344,6 +344,7 @@ enum {
__WQ_ORDERED = 1 << 17, /* internal: workqueue is ordered */
__WQ_LEGACY = 1 << 18, /* internal: create*_workqueue() */
__WQ_ORDERED_EXPLICIT = 1 << 19, /* internal: alloc_ordered_workqueue() */
+ __WQ_HAS_BEEN_USED = 1 << 20, /* internal: work has been queued */
WQ_MAX_ACTIVE = 512, /* I like 512, better ideas? */
WQ_MAX_UNBOUND_PER_CPU = 4, /* 4 * #cpus for unbound wq */
diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index fc9129d5909e..0ef275fe526c 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -1383,6 +1383,10 @@ static void __queue_work(int cpu, struct workqueue_struct *wq,
if (unlikely(wq->flags & __WQ_DRAINING) &&
WARN_ON_ONCE(!is_chained_work(wq)))
return;
+
+ if (!(wq->flags & __WQ_HAS_BEEN_USED))
+ wq->flags |= __WQ_HAS_BEEN_USED;
+
retry:
if (req_cpu == WORK_CPU_UNBOUND)
cpu = wq_select_unbound_cpu(raw_smp_processor_id());
@@ -2889,7 +2893,7 @@ static bool start_flush_work(struct work_struct *work, struct wq_barrier *barr,
* workqueues the deadlock happens when the rescuer stalls, blocking
* forward progress.
*/
- if (!from_cancel &&
+ if (!from_cancel && (pwq->wq->flags & __WQ_HAS_BEEN_USED) &&
(pwq->wq->saved_max_active == 1 || pwq->wq->rescuer)) {
lock_acquire_exclusive(&pwq->wq->lockdep_map, 0, 0, NULL,
_THIS_IP_);
--
2.19.1.568.g152ad8e336-goog
next prev parent reply other threads:[~2018-10-25 15:07 UTC|newest]
Thread overview: 26+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-10-25 15:05 [PATCH 0/3] Suppress false positives triggered by workqueue lockdep annotations Bart Van Assche
2018-10-25 15:05 ` [PATCH 1/3] kernel/workqueue: Remove lockdep annotation from __flush_work() Bart Van Assche
2018-10-25 15:31 ` Johannes Berg
2018-10-25 15:57 ` Johannes Berg
2018-10-25 16:01 ` Bart Van Assche
2018-10-25 15:05 ` [PATCH 2/3] kernel/workqueue: Surround work execution with shared lock annotations Bart Van Assche
2018-10-25 16:53 ` Johannes Berg
2018-10-25 17:22 ` Bart Van Assche
2018-10-25 19:17 ` Johannes Berg
2018-10-25 15:05 ` Bart Van Assche [this message]
2018-10-25 15:34 ` [PATCH 3/3] kernel/workqueue: Suppress a false positive lockdep complaint Johannes Berg
2018-10-25 15:55 ` Bart Van Assche
2018-10-25 19:59 ` Johannes Berg
2018-10-25 20:21 ` Theodore Y. Ts'o
2018-10-25 20:26 ` Johannes Berg
2018-10-25 15:36 ` Tejun Heo
2018-10-25 15:37 ` Tejun Heo
2018-10-25 20:13 ` Johannes Berg
2018-10-25 15:40 ` Theodore Y. Ts'o
2018-10-25 17:02 ` Johannes Berg
2018-10-25 17:11 ` Bart Van Assche
2018-10-25 19:51 ` Johannes Berg
2018-10-25 20:39 ` Bart Van Assche
2018-10-25 20:47 ` Johannes Berg
2018-10-25 15:27 ` [PATCH 0/3] Suppress false positives triggered by workqueue lockdep annotations Johannes Berg
2018-10-25 15:47 ` Bart Van Assche
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20181025150540.259281-4-bvanassche@acm.org \
--to=bvanassche@acm.org \
--cc=hch@lst.de \
--cc=johannes.berg@intel.com \
--cc=linux-kernel@vger.kernel.org \
--cc=sagi@grimberg.me \
--cc=tj@kernel.org \
--cc=tytso@mit.edu \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.