public inbox for linux-scsi@vger.kernel.org
From: Bart Van Assche <Bart.VanAssche@sandisk.com>
To: "linux-scsi@vger.kernel.org" <linux-scsi@vger.kernel.org>,
	"songliubraving@fb.com" <songliubraving@fb.com>
Cc: "hch@infradead.org" <hch@infradead.org>
Subject: Re: [RFC] scsi: reduce protection of scan_mutex in scsi_remove_device
Date: Fri, 21 Apr 2017 21:17:32 +0000	[thread overview]
Message-ID: <1492809452.2499.2.camel@sandisk.com> (raw)
In-Reply-To: <20170421211302.2667649-1-songliubraving@fb.com>

On Fri, 2017-04-21 at 14:13 -0700, Song Liu wrote:
> When a device is deleted through sysfs handle "delete", [ ... ]

If I try to use that sysfs attribute, I run into a deadlock (see the
lockdep report below). How is it possible that you did not hit this
deadlock in your tests?
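For reference, this is the kind of write that triggers sdev_store_delete()
and hence the scsi_remove_device() -> scan_mutex path in the report; the
0:0:0:0 device address below is only an example, substitute your own:

```shell
# Writing "1" to the scsi_device "delete" attribute invokes
# sdev_store_delete(), which calls scsi_remove_device() and takes
# shost->scan_mutex while kernfs still holds the attribute's s_active.
echo 1 > /sys/class/scsi_device/0:0:0:0/device/delete
```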

Bart.

======================================================
[ INFO: possible circular locking dependency detected ]
4.11.0-rc6-dbg+ #3 Tainted: G          I    
-------------------------------------------------------
bash/7858 is trying to acquire lock:
 (&shost->scan_mutex){+.+.+.}, at: [<ffffffff814de090>] scsi_remove_device+0x20/0x40

but task is already holding lock:
 (s_active#326){++++.+}, at: [<ffffffff81293e20>] kernfs_remove_self+0xe0/0x140

which lock already depends on the new lock.


the existing dependency chain (in reverse order) is:

-> #1 (s_active#326){++++.+}:
       lock_acquire+0xd5/0x1c0
       __kernfs_remove+0x248/0x310
       kernfs_remove_by_name_ns+0x45/0xa0
       remove_files.isra.1+0x35/0x70
       sysfs_remove_group+0x44/0x90
       sysfs_remove_groups+0x2e/0x50
       device_remove_attrs+0x5e/0x80
       device_del+0x1fd/0x320
       __scsi_remove_device+0xe9/0x120
       scsi_forget_host+0x60/0x70
       scsi_remove_host+0x71/0x110
       0xffffffffa0703690
       process_one_work+0x20b/0x6a0
       worker_thread+0x4e/0x4a0
       kthread+0x113/0x150
       ret_from_fork+0x2e/0x40

-> #0 (&shost->scan_mutex){+.+.+.}:
       __lock_acquire+0x1109/0x1280
       lock_acquire+0xd5/0x1c0
       __mutex_lock+0x83/0x980
       mutex_lock_nested+0x1b/0x20
       scsi_remove_device+0x20/0x40
       sdev_store_delete+0x27/0x30
       dev_attr_store+0x18/0x30
       sysfs_kf_write+0x45/0x60
       kernfs_fop_write+0x13c/0x1c0
       __vfs_write+0x28/0x140
       vfs_write+0xc8/0x1e0
       SyS_write+0x49/0xa0
       entry_SYSCALL_64_fastpath+0x18/0xad

other info that might help us debug this:

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(s_active#326);
                               lock(&shost->scan_mutex);
                               lock(s_active#326);
  lock(&shost->scan_mutex);

 *** DEADLOCK ***

3 locks held by bash/7858:
 #0:  (sb_writers#4){.+.+.+}, at: [<ffffffff81206b45>] vfs_write+0x195/0x1e0
 #1:  (&of->mutex){+.+.+.}, at: [<ffffffff81294af6>] kernfs_fop_write+0x106/0x1c0
 #2:  (s_active#326){++++.+}, at: [<ffffffff81293e20>] kernfs_remove_self+0xe0/0x140

stack backtrace:
CPU: 3 PID: 7858 Comm: bash Tainted: G          I     4.11.0-rc6-dbg+ #3
Hardware name: Dell Inc. PowerEdge R430/03XKDV, BIOS 1.0.2 11/17/2014
Call Trace:
 dump_stack+0x68/0x93
 print_circular_bug+0x1be/0x210
 __lock_acquire+0x1109/0x1280
 lock_acquire+0xd5/0x1c0
 __mutex_lock+0x83/0x980
 mutex_lock_nested+0x1b/0x20
 scsi_remove_device+0x20/0x40
 sdev_store_delete+0x27/0x30
 dev_attr_store+0x18/0x30
 sysfs_kf_write+0x45/0x60
 kernfs_fop_write+0x13c/0x1c0
 __vfs_write+0x28/0x140
 vfs_write+0xc8/0x1e0
 SyS_write+0x49/0xa0
 entry_SYSCALL_64_fastpath+0x18/0xad
RIP: 0033:0x7fec0f748500
RSP: 002b:00007ffc1ddaec98 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 0000000000000046 RCX: 00007fec0f748500
RDX: 0000000000000002 RSI: 0000000002012aa0 RDI: 0000000000000001
RBP: 00007ffc1ddaebf0 R08: 00007fec0fa0a740 R09: 00007fec10061100
R10: 0000000000000098 R11: 0000000000000246 R12: 00007fec10084bf0
R13: 0000000000000001 R14: 0000000000000000 R15: 00007ffc1ddaec18

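The CPU0/CPU1 scenario in the splat above is a classic ABBA inversion: one
path takes scan_mutex then s_active, the other s_active then scan_mutex. As
a purely illustrative userspace sketch (not kernel code; the lock names here
are just labels), a toy lock-order tracker flags the same cycle that lockdep
reports:

```python
# Toy lock-order tracker, illustrating how the ABBA inversion in the
# report above is detected. Userspace illustration only; "scan_mutex"
# and "s_active" are plain strings, not the kernel objects.

held_after = {}  # lock -> set of locks ever acquired while it was held

def acquire(held, lock):
    """Record that `lock` is taken while the locks in `held` are held.
    Return True if that acquisition closes a dependency cycle."""
    for h in held:
        held_after.setdefault(h, set()).add(lock)
    # Cycle check: is any currently-held lock reachable from `lock`?
    seen, stack = set(), [lock]
    while stack:
        l = stack.pop()
        if l in held:
            return True  # circular dependency: deadlock possible
        if l in seen:
            continue
        seen.add(l)
        stack.extend(held_after.get(l, ()))
    held.append(lock)
    return False

# CPU1's history: scan_mutex -> s_active (the scsi_remove_host() path)
cpu1 = []
acquire(cpu1, "scan_mutex")
acquire(cpu1, "s_active")

# CPU0: s_active (kernfs_remove_self) then scan_mutex (scsi_remove_device)
cpu0 = []
acquire(cpu0, "s_active")
deadlock = acquire(cpu0, "scan_mutex")
print(deadlock)  # True: same lock pair taken in opposite order
```

This is only the ordering argument, of course; real lockdep tracks classes,
read/write states, and interrupt contexts on top of it.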

Thread overview: 14+ messages
2017-04-21 21:13 [RFC] scsi: reduce protection of scan_mutex in scsi_remove_device Song Liu
2017-04-21 21:17 ` Bart Van Assche [this message]
2017-04-21 22:20   ` Song Liu
2017-04-21 21:20 ` Bart Van Assche
2017-04-21 22:31   ` Song Liu
2017-04-25 20:59     ` Bart Van Assche
2017-04-25 21:29       ` Song Liu
2017-04-24 15:28 ` Christoph Hellwig
2017-04-25 17:23 ` Bart Van Assche
2017-04-25 17:42   ` Song Liu
2017-04-25 17:52     ` Bart Van Assche
2017-04-25 21:17       ` Song Liu
2017-04-25 22:17         ` Bart Van Assche
2017-04-26  0:41           ` Song Liu
