All of lore.kernel.org
 help / color / mirror / Atom feed
From: Bart Van Assche <bvanassche@acm.org>
To: Tejun Heo <tj@kernel.org>
Cc: linux-kernel@vger.kernel.org,
	Johannes Berg <johannes.berg@intel.com>,
	Christoph Hellwig <hch@lst.de>, Sagi Grimberg <sagi@grimberg.me>,
	tytso@mit.edu, bvanassche@acm.org
Subject: [PATCH 3/3] kernel/workqueue: Suppress a false positive lockdep complaint
Date: Thu, 25 Oct 2018 08:05:40 -0700	[thread overview]
Message-ID: <20181025150540.259281-4-bvanassche@acm.org> (raw)
In-Reply-To: <20181025150540.259281-1-bvanassche@acm.org>

It can happen that the direct I/O queue creates and destroys an empty
workqueue from inside a work function. Avoid that this triggers the
false positive lockdep complaint shown below.

======================================================
WARNING: possible circular locking dependency detected
4.19.0-dbg+ #1 Not tainted
------------------------------------------------------
fio/4129 is trying to acquire lock:
00000000a01cfe1a ((wq_completion)"dio/%s"sb->s_id){+.+.}, at: flush_workqueue+0xd0/0x970

but task is already holding lock:
00000000a0acecf9 (&sb->s_type->i_mutex_key#14){+.+.}, at: ext4_file_write_iter+0x154/0x710

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #2 (&sb->s_type->i_mutex_key#14){+.+.}:
       down_write+0x3d/0x80
       __generic_file_fsync+0x77/0xf0
       ext4_sync_file+0x3c9/0x780
       vfs_fsync_range+0x66/0x100
       dio_complete+0x2f5/0x360
       dio_aio_complete_work+0x1c/0x20
       process_one_work+0x481/0x9f0
       worker_thread+0x63/0x5a0
       kthread+0x1cf/0x1f0
       ret_from_fork+0x24/0x30

-> #1 ((work_completion)(&dio->complete_work)){+.+.}:
       process_one_work+0x447/0x9f0
       worker_thread+0x63/0x5a0
       kthread+0x1cf/0x1f0
       ret_from_fork+0x24/0x30

-> #0 ((wq_completion)"dio/%s"sb->s_id){+.+.}:
       lock_acquire+0xc5/0x200
       flush_workqueue+0xf3/0x970
       drain_workqueue+0xec/0x220
       destroy_workqueue+0x23/0x350
       sb_init_dio_done_wq+0x6a/0x80
       do_blockdev_direct_IO+0x1f33/0x4be0
       __blockdev_direct_IO+0x79/0x86
       ext4_direct_IO+0x5df/0xbb0
       generic_file_direct_write+0x119/0x220
       __generic_file_write_iter+0x131/0x2d0
       ext4_file_write_iter+0x3fa/0x710
       aio_write+0x235/0x330
       io_submit_one+0x510/0xeb0
       __x64_sys_io_submit+0x122/0x340
       do_syscall_64+0x71/0x220
       entry_SYSCALL_64_after_hwframe+0x49/0xbe

other info that might help us debug this:

Chain exists of:
  (wq_completion)"dio/%s"sb->s_id --> (work_completion)(&dio->complete_work) --> &sb->s_type->i_mutex_key#14

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(&sb->s_type->i_mutex_key#14);
                               lock((work_completion)(&dio->complete_work));
                               lock(&sb->s_type->i_mutex_key#14);
  lock((wq_completion)"dio/%s"sb->s_id);

 *** DEADLOCK ***

1 lock held by fio/4129:
 #0: 00000000a0acecf9 (&sb->s_type->i_mutex_key#14){+.+.}, at: ext4_file_write_iter+0x154/0x710

stack backtrace:
CPU: 3 PID: 4129 Comm: fio Not tainted 4.19.0-dbg+ #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
Call Trace:
 dump_stack+0x86/0xc5
 print_circular_bug.isra.32+0x20a/0x218
 __lock_acquire+0x1c68/0x1cf0
 lock_acquire+0xc5/0x200
 flush_workqueue+0xf3/0x970
 drain_workqueue+0xec/0x220
 destroy_workqueue+0x23/0x350
 sb_init_dio_done_wq+0x6a/0x80
 do_blockdev_direct_IO+0x1f33/0x4be0
 __blockdev_direct_IO+0x79/0x86
 ext4_direct_IO+0x5df/0xbb0
 generic_file_direct_write+0x119/0x220
 __generic_file_write_iter+0x131/0x2d0
 ext4_file_write_iter+0x3fa/0x710
 aio_write+0x235/0x330
 io_submit_one+0x510/0xeb0
 __x64_sys_io_submit+0x122/0x340
 do_syscall_64+0x71/0x220
 entry_SYSCALL_64_after_hwframe+0x49/0xbe

Cc: Johannes Berg <johannes.berg@intel.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
---
 include/linux/workqueue.h | 1 +
 kernel/workqueue.c        | 6 +++++-
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/include/linux/workqueue.h b/include/linux/workqueue.h
index 60d673e15632..375ec764f148 100644
--- a/include/linux/workqueue.h
+++ b/include/linux/workqueue.h
@@ -344,6 +344,7 @@ enum {
 	__WQ_ORDERED		= 1 << 17, /* internal: workqueue is ordered */
 	__WQ_LEGACY		= 1 << 18, /* internal: create*_workqueue() */
 	__WQ_ORDERED_EXPLICIT	= 1 << 19, /* internal: alloc_ordered_workqueue() */
+	__WQ_HAS_BEEN_USED	= 1 << 20, /* internal: work has been queued */
 
 	WQ_MAX_ACTIVE		= 512,	  /* I like 512, better ideas? */
 	WQ_MAX_UNBOUND_PER_CPU	= 4,	  /* 4 * #cpus for unbound wq */
diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index fc9129d5909e..0ef275fe526c 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -1383,6 +1383,10 @@ static void __queue_work(int cpu, struct workqueue_struct *wq,
 	if (unlikely(wq->flags & __WQ_DRAINING) &&
 	    WARN_ON_ONCE(!is_chained_work(wq)))
 		return;
+
+	if (!(wq->flags & __WQ_HAS_BEEN_USED))
+		wq->flags |= __WQ_HAS_BEEN_USED;
+
 retry:
 	if (req_cpu == WORK_CPU_UNBOUND)
 		cpu = wq_select_unbound_cpu(raw_smp_processor_id());
@@ -2889,7 +2893,7 @@ static bool start_flush_work(struct work_struct *work, struct wq_barrier *barr,
 	 * workqueues the deadlock happens when the rescuer stalls, blocking
 	 * forward progress.
 	 */
-	if (!from_cancel &&
+	if (!from_cancel && (pwq->wq->flags & __WQ_HAS_BEEN_USED) &&
 	    (pwq->wq->saved_max_active == 1 || pwq->wq->rescuer)) {
 		lock_acquire_exclusive(&pwq->wq->lockdep_map, 0, 0, NULL,
 				       _THIS_IP_);
-- 
2.19.1.568.g152ad8e336-goog


  parent reply	other threads:[~2018-10-25 15:07 UTC|newest]

Thread overview: 26+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-10-25 15:05 [PATCH 0/3] Suppress false positives triggered by workqueue lockdep annotations Bart Van Assche
2018-10-25 15:05 ` [PATCH 1/3] kernel/workqueue: Remove lockdep annotation from __flush_work() Bart Van Assche
2018-10-25 15:31   ` Johannes Berg
2018-10-25 15:57     ` Johannes Berg
2018-10-25 16:01     ` Bart Van Assche
2018-10-25 15:05 ` [PATCH 2/3] kernel/workqueue: Surround work execution with shared lock annotations Bart Van Assche
2018-10-25 16:53   ` Johannes Berg
2018-10-25 17:22     ` Bart Van Assche
2018-10-25 19:17       ` Johannes Berg
2018-10-25 15:05 ` Bart Van Assche [this message]
2018-10-25 15:34   ` [PATCH 3/3] kernel/workqueue: Suppress a false positive lockdep complaint Johannes Berg
2018-10-25 15:55     ` Bart Van Assche
2018-10-25 19:59       ` Johannes Berg
2018-10-25 20:21         ` Theodore Y. Ts'o
2018-10-25 20:26           ` Johannes Berg
2018-10-25 15:36   ` Tejun Heo
2018-10-25 15:37     ` Tejun Heo
2018-10-25 20:13     ` Johannes Berg
2018-10-25 15:40   ` Theodore Y. Ts'o
2018-10-25 17:02   ` Johannes Berg
2018-10-25 17:11     ` Bart Van Assche
2018-10-25 19:51       ` Johannes Berg
2018-10-25 20:39         ` Bart Van Assche
2018-10-25 20:47           ` Johannes Berg
2018-10-25 15:27 ` [PATCH 0/3] Suppress false positives triggered by workqueue lockdep annotations Johannes Berg
2018-10-25 15:47   ` Bart Van Assche

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20181025150540.259281-4-bvanassche@acm.org \
    --to=bvanassche@acm.org \
    --cc=hch@lst.de \
    --cc=johannes.berg@intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=sagi@grimberg.me \
    --cc=tj@kernel.org \
    --cc=tytso@mit.edu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.