Re: possible deadlock in flush_workqueue (2)

All of lore.kernel.org
 help / color / mirror / Atom feed

From: "Theodore Y. Ts'o" <tytso@mit.edu>
To: syzbot <syzbot+a50c7541a4a55cd49b02@syzkaller.appspotmail.com>
Cc: adilger.kernel@dilger.ca, linux-ext4@vger.kernel.org,
	linux-kernel@vger.kernel.org, syzkaller-bugs@googlegroups.com,
	linux-fsdevel@vger.kernel.org, tj@kernel.org,
	jiangshanlai@gmail.com
Subject: Re: possible deadlock in flush_workqueue (2)
Date: Tue, 23 Oct 2018 10:28:04 -0400	[thread overview]
Message-ID: <20181023142804.GC6850@thunk.org> (raw)
In-Reply-To: <0000000000009bcd4a0578e22ebd@google.com>

On Tue, Oct 23, 2018 at 02:42:03AM -0700, syzbot wrote:
> syzbot has found a reproducer for the following crash on:
> 
> HEAD commit:    ca9eb48fe01f Merge tag 'regulator-v5.0' of git://git.kerne..
> git tree:       upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=11a60939400000
> kernel config:  https://syzkaller.appspot.com/x/.config?x=963b24abf3f7c2d8
> dashboard link: https://syzkaller.appspot.com/bug?extid=a50c7541a4a55cd49b02
> compiler:       gcc (GCC) 8.0.1 20180413 (experimental)
> syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=12097f03400000
> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=10b55ac5400000
> 
> IMPORTANT: if you fix the bug, please add the following tag to the commit:
> Reported-by: syzbot+a50c7541a4a55cd49b02@syzkaller.appspotmail.com

This is a false positive in lockdep, but it's not clear to me what's
the best way to fix it.  It's not ext4-specific, so I've added
linux-fsdevel, Tejun, and Lai to the cc.

The problem is that when we first start doing direct I/O in a
particular file system, we use a lockless technique to allocate the
dio workqueue.  From fs/direct-io.c:

int sb_init_dio_done_wq(struct super_block *sb)
{
	struct workqueue_struct *old;
	struct workqueue_struct *wq = alloc_workqueue("dio/%s",
						      WQ_MEM_RECLAIM, 0,
						      sb->s_id);
	if (!wq)
		return -ENOMEM;
	/*
	 * This has to be atomic as more DIOs can race to create the workqueue
	 */
	old = cmpxchg(&sb->s_dio_done_wq, NULL, wq);
	/* Someone created workqueue before us? Free ours... */
	if (old)
		destroy_workqueue(wq);
	return 0;
}

If we have multiple DIO writers racing to initialize
sb->s_dio_done_wq, the losers of the race will destroy the fresly
created workqueue.  Because it's freshly created, there's nothing to
drain --- but lockdep doesn't know that.

One way I suppose we could solve this is to call sb_init_dio_done_wq
on open if the O_DIRECT flag is set.  But now if we have multiple
threads racing to open files with O_DIRECT we'll still hit this
particular race.

The other, and I think better way, we could fix this is to have an
alternate destroy_never_used_workqueue() function which skips the
drain_workqueue().

What do people think?

The annotated lockdep is attached below.

						- Ted

[  263.210408] EXT4-fs (vdc): mounted filesystem without journal. Opts: acl,user_xattr,d
[  263.227896]
[  263.229118] ======================================================
[  263.231370] WARNING: possible circular locking dependency detected
[  263.232950] 4.19.0-rc6-xfstests-00021-g33458eaba4df #706 Not tainted
[  263.234762] ------------------------------------------------------
[  263.236915] kworker/1:3/1371 is trying to acquire lock:
[  263.238630] 00000000f97c1367 (&sb->s_type->i_mutex_key#11){+.+.}, at: __generic_file0
[  263.241115]
[  263.241115] but task is already holding lock:
[  263.243121] 000000008f473d52 ((work_completion)(&dio->complete_work)){+.+.}, at: pro0
[  263.246910]
[  263.246910] which lock already depends on the new lock.
[  263.246910]
[  263.249597]
[  263.249597] the existing dependency chain (in reverse order) is:
[  263.252024]
[  263.252024] -> #2 ((work_completion)(&dio->complete_work)){+.+.}:
[  263.254384] process_one_work (/arch/x86/include/asm/jump_label.h:36 /include/linux/jump_label.h:142 /include/trace/events/workqueue.h:89 /kernel/workqueue.c:2152) 
[  263.255835] worker_thread (/include/linux/compiler.h:188 /include/linux/list.h:203 /kernel/workqueue.c:2297) 
[  263.257180] kthread (/kernel/kthread.c:246) 
[  263.258310] ret_from_fork (/arch/x86/entry/entry_64.S:419) 
[  263.259449]
[  263.259449] -> #1 ((wq_completion)"dio/%s"sb->s_id){+.+.}:
[  263.260991] flush_workqueue (/kernel/workqueue.c:2655) 
[  263.262024] drain_workqueue (/kernel/workqueue.c:2822) 
[  263.263076] destroy_workqueue (/kernel/workqueue.c:4158) 
[  263.264177] sb_init_dio_done_wq (/fs/direct-io.c:636) 
[  263.265131] __blockdev_direct_IO (/fs/direct-io.c:1283 /fs/direct-io.c:1417) 
[  263.266248] ext4_direct_IO (/fs/ext4/inode.c:3778 /fs/ext4/inode.c:3901) 
[  263.267184] generic_file_direct_write (/mm/filemap.c:3057) 
[  263.268164] __generic_file_write_iter (/mm/filemap.c:3229) 
[  263.269173] ext4_file_write_iter (/include/linux/fs.h:743 /fs/ext4/file.c:267) 
[  263.270052] aio_write (/fs/aio.c:1473 /fs/aio.c:1561) 
[  263.270821] io_submit_one (/fs/aio.c:1836) 
[  263.272136] __se_sys_io_submit (/fs/aio.c:1916 /fs/aio.c:1887) 
[  263.273565] do_syscall_64 (/arch/x86/entry/common.c:290) 
[  263.274831] entry_SYSCALL_64_after_hwframe (/arch/x86/entry/entry_64.S:240) 
[  263.276514]
[  263.276514] -> #0 (&sb->s_type->i_mutex_key#11){+.+.}:
[  263.278492] lock_acquire (/arch/x86/include/asm/paravirt.h:788 /kernel/locking/lockdep.c:3903) 
[  263.279690] down_write (/arch/x86/include/asm/rwsem.h:160 /kernel/locking/rwsem.c:72) 
[  263.280812] __generic_file_fsync (/fs/libfs.c:982) 
[  263.282156] ext4_sync_file (/fs/ext4/fsync.c:121) 
[  263.283400] dio_complete (/include/linux/fs.h:2733 /fs/direct-io.c:329) 
[  263.284629] process_one_work (/arch/x86/include/asm/jump_label.h:36 /include/linux/jump_label.h:142 /include/trace/events/workqueue.h:114 /kernel/workqueue.c:2158) 
[  263.285916] worker_thread (/include/linux/compiler.h:188 /include/linux/list.h:203 /kernel/workqueue.c:2297) 
[  263.287114] kthread (/kernel/kthread.c:246) 
[  263.288198] ret_from_fork (/arch/x86/entry/entry_64.S:419) 
[  263.289394]
[  263.289394] other info that might help us debug this:
[  263.289394]
[  263.291805] Chain exists of:
[  263.291805]   &sb->s_type->i_mutex_key#11 --> (wq_completion)"dio/%s"sb->s_id --> (w)
[  263.291805]
[  263.294392]  Possible unsafe locking scenario:
[  263.294392]
[  263.295398]        CPU0                    CPU1
[  263.296171]        ----                    ----
[  263.296949]   lock((work_completion)(&dio->complete_work));
[  263.298481]                                lock((wq_completion)"dio/%s"sb->s_id);
[  263.300579]                                lock((work_completion)(&dio->complete_wor;
[  263.302856]   lock(&sb->s_type->i_mutex_key#11);
[  263.304175]
[  263.304175]  *** DEADLOCK ***
[  263.304175]
[  263.305861] 2 locks held by kworker/1:3/1371:
[  263.307097]  #0: 000000004a793a56 ((wq_completion)"dio/%s"sb->s_id){+.+.}, at: proce0
[  263.309761]  #1: 000000008f473d52 ((work_completion)(&dio->complete_work)){+.+.}, at0
[  263.312663]
[  263.312663] stack backtrace:
[  263.313881] CPU: 1 PID: 1371 Comm: kworker/1:3 Not tainted 4.19.0-rc6-xfstests-000216
[  263.316541] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.1-1 04/4
[  263.318894] Workqueue: dio/vdc dio_aio_complete_work
[  263.320303] Call Trace:
[  263.321038] dump_stack (/lib/dump_stack.c:115) 
[  263.322010] print_circular_bug.isra.19.cold.39 (/kernel/locking/lockdep.c:1221) 
[  263.323581] check_prev_add.constprop.27 (/kernel/locking/lockdep.c:1861) 
[  263.324991] __lock_acquire (/kernel/locking/lockdep.c:1975 /kernel/locking/lockdep.c:2415 /kernel/locking/lockdep.c:3411) 
[  263.326054] lock_acquire (/arch/x86/include/asm/paravirt.h:788 /kernel/locking/lockdep.c:3903) 
[  263.327086] ? __generic_file_fsync (/fs/libfs.c:982) 
[  263.328349] down_write (/arch/x86/include/asm/rwsem.h:160 /kernel/locking/rwsem.c:72) 
[  263.329300] ? __generic_file_fsync (/fs/libfs.c:982) 
[  263.330504] __generic_file_fsync (/fs/libfs.c:982) 
[  263.331663] ext4_sync_file (/fs/ext4/fsync.c:121) 
[  263.332722] dio_complete (/include/linux/fs.h:2733 /fs/direct-io.c:329) 
[  263.333772] process_one_work (/arch/x86/include/asm/jump_label.h:36 /include/linux/jump_label.h:142 /include/trace/events/workqueue.h:114 /kernel/workqueue.c:2158) 
[  263.334719] worker_thread (/include/linux/compiler.h:188 /include/linux/list.h:203 /kernel/workqueue.c:2297) 
[  263.335482] ? process_one_work (/kernel/workqueue.c:2239) 
[  263.336241] kthread (/kernel/kthread.c:246) 
[  263.336826] ? kthread_park (/kernel/kthread.c:206) 
[  263.337548] ret_from_fork (/arch/x86/entry/entry_64.S:419)

next prev parent reply	other threads:[~2018-10-23 22:52 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-10-20 16:57 possible deadlock in flush_workqueue (2) syzbot
2018-10-23  9:42 ` syzbot
2018-10-23 14:28   ` Theodore Y. Ts'o [this message]
2018-10-23 19:40     ` Tejun Heo
2019-03-22 11:12 ` syzbot
2019-11-07 13:42 ` syzbot

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20181023142804.GC6850@thunk.org \
    --to=tytso@mit.edu \
    --cc=adilger.kernel@dilger.ca \
    --cc=jiangshanlai@gmail.com \
    --cc=linux-ext4@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=syzbot+a50c7541a4a55cd49b02@syzkaller.appspotmail.com \
    --cc=syzkaller-bugs@googlegroups.com \
    --cc=tj@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.