Re: [RFC] scsi: reduce protection of scan_mutex in scsi_remove_device

From: Bart Van Assche <Bart.VanAssche@sandisk.com>
To: "linux-scsi@vger.kernel.org" <linux-scsi@vger.kernel.org>,
	"songliubraving@fb.com" <songliubraving@fb.com>
Cc: "hch@infradead.org" <hch@infradead.org>
Subject: Re: [RFC] scsi: reduce protection of scan_mutex in scsi_remove_device
Date: Fri, 21 Apr 2017 21:17:32 +0000
Message-ID: <1492809452.2499.2.camel@sandisk.com>
In-Reply-To: <20170421211302.2667649-1-songliubraving@fb.com>

On Fri, 2017-04-21 at 14:13 -0700, Song Liu wrote:
> When a device is deleted through sysfs handle "delete", [ ... ]

If I try to use that sysfs attribute then I encounter a deadlock (see
also the report below). How is it possible that you have not hit that
deadlock in your tests?

Bart.

======================================================
[ INFO: possible circular locking dependency detected ]
4.11.0-rc6-dbg+ #3 Tainted: G          I    
-------------------------------------------------------
bash/7858 is trying to acquire lock:
 (&shost->scan_mutex){+.+.+.}, at: [<ffffffff814de090>] scsi_remove_device+0x20/0x40

but task is already holding lock:
 (s_active#326){++++.+}, at: [<ffffffff81293e20>] kernfs_remove_self+0xe0/0x140

which lock already depends on the new lock.


the existing dependency chain (in reverse order) is:

-> #1 (s_active#326){++++.+}:
       lock_acquire+0xd5/0x1c0
       __kernfs_remove+0x248/0x310
       kernfs_remove_by_name_ns+0x45/0xa0
       remove_files.isra.1+0x35/0x70
       sysfs_remove_group+0x44/0x90
       sysfs_remove_groups+0x2e/0x50
       device_remove_attrs+0x5e/0x80
       device_del+0x1fd/0x320
       __scsi_remove_device+0xe9/0x120
       scsi_forget_host+0x60/0x70
       scsi_remove_host+0x71/0x110
       0xffffffffa0703690
       process_one_work+0x20b/0x6a0
       worker_thread+0x4e/0x4a0
       kthread+0x113/0x150
       ret_from_fork+0x2e/0x40

-> #0 (&shost->scan_mutex){+.+.+.}:
       __lock_acquire+0x1109/0x1280
       lock_acquire+0xd5/0x1c0
       __mutex_lock+0x83/0x980
       mutex_lock_nested+0x1b/0x20
       scsi_remove_device+0x20/0x40
       sdev_store_delete+0x27/0x30
       dev_attr_store+0x18/0x30
       sysfs_kf_write+0x45/0x60
       kernfs_fop_write+0x13c/0x1c0
       __vfs_write+0x28/0x140
       vfs_write+0xc8/0x1e0
       SyS_write+0x49/0xa0
       entry_SYSCALL_64_fastpath+0x18/0xad

other info that might help us debug this:

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(s_active#326);
                               lock(&shost->scan_mutex);
                               lock(s_active#326);
  lock(&shost->scan_mutex);

 *** DEADLOCK ***

3 locks held by bash/7858:
 #0:  (sb_writers#4){.+.+.+}, at: [<ffffffff81206b45>] vfs_write+0x195/0x1e0
 #1:  (&of->mutex){+.+.+.}, at: [<ffffffff81294af6>] kernfs_fop_write+0x106/0x1c0
 #2:  (s_active#326){++++.+}, at: [<ffffffff81293e20>] kernfs_remove_self+0xe0/0x140

stack backtrace:
CPU: 3 PID: 7858 Comm: bash Tainted: G          I     4.11.0-rc6-dbg+ #3
Hardware name: Dell Inc. PowerEdge R430/03XKDV, BIOS 1.0.2 11/17/2014
Call Trace:
 dump_stack+0x68/0x93
 print_circular_bug+0x1be/0x210
 __lock_acquire+0x1109/0x1280
 lock_acquire+0xd5/0x1c0
 __mutex_lock+0x83/0x980
 mutex_lock_nested+0x1b/0x20
 scsi_remove_device+0x20/0x40
 sdev_store_delete+0x27/0x30
 dev_attr_store+0x18/0x30
 sysfs_kf_write+0x45/0x60
 kernfs_fop_write+0x13c/0x1c0
 __vfs_write+0x28/0x140
 vfs_write+0xc8/0x1e0
 SyS_write+0x49/0xa0
 entry_SYSCALL_64_fastpath+0x18/0xad
RIP: 0033:0x7fec0f748500
RSP: 002b:00007ffc1ddaec98 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 0000000000000046 RCX: 00007fec0f748500
RDX: 0000000000000002 RSI: 0000000002012aa0 RDI: 0000000000000001
RBP: 00007ffc1ddaebf0 R08: 00007fec0fa0a740 R09: 00007fec10061100
R10: 0000000000000098 R11: 0000000000000246 R12: 00007fec10084bf0
R13: 0000000000000001 R14: 0000000000000000 R15: 00007ffc1ddaec18
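
The circular dependency in the report above can be reproduced on paper with a
small lock-order model. The sketch below is only an illustration of the idea,
not lockdep's actual implementation: the class LockOrderGraph and its acquire()
method are hypothetical names, and the two recorded orderings are taken from
the "Possible unsafe locking scenario" table (scan_mutex held while s_active
is taken on the host-removal path, and s_active held while scan_mutex is
taken on the sysfs "delete" path).

```python
from collections import defaultdict


class LockOrderGraph:
    """Toy model of lock-order tracking (hypothetical, not lockdep itself)."""

    def __init__(self):
        # Maps a lock name to the set of locks taken while it was held.
        self.after = defaultdict(set)

    def _reaches(self, src, dst):
        # Depth-first search: is dst reachable from src along recorded edges?
        stack, seen = [src], set()
        while stack:
            node = stack.pop()
            if node == dst:
                return True
            if node in seen:
                continue
            seen.add(node)
            stack.extend(self.after.get(node, ()))
        return False

    def acquire(self, held, new):
        """Record 'new taken while holding held'.

        Returns True if this ordering closes a cycle, i.e. some earlier
        path already took 'held' (directly or indirectly) after 'new'.
        """
        inversion = self._reaches(new, held)
        self.after[held].add(new)
        return inversion


graph = LockOrderGraph()

# Chain #1 in the report: scsi_remove_host() path, scan_mutex then s_active.
first = graph.acquire("&shost->scan_mutex", "s_active#326")

# Chain #0 in the report: sysfs "delete" write, s_active then scan_mutex.
second = graph.acquire("s_active#326", "&shost->scan_mutex")

print(first, second)  # prints: False True
```

The second call reports the inversion because the reverse ordering is already
recorded, which is the same condition the report flags at scsi_remove_device().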
