From mboxrd@z Thu Jan 1 00:00:00 1970 From: James Bottomley Subject: Re: [PATCH] Avoid that SCSI device removal through sysfs triggers a deadlock Date: Fri, 28 Oct 2016 19:08:56 -0700 Message-ID: <1477706936.2850.27.camel@linux.vnet.ibm.com> References: <14379fd1-c9bd-ad75-ca7c-0632f3e3c5d1@sandisk.com> Mime-Version: 1.0 Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: 7bit Return-path: Received: from mx0b-001b2d01.pphosted.com ([148.163.158.5]:47500 "EHLO mx0a-001b2d01.pphosted.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1754475AbcJ2CJR (ORCPT ); Fri, 28 Oct 2016 22:09:17 -0400 Received: from pps.filterd (m0098419.ppops.net [127.0.0.1]) by mx0b-001b2d01.pphosted.com (8.16.0.17/8.16.0.17) with SMTP id u9T28c87010593 for ; Fri, 28 Oct 2016 22:09:05 -0400 Received: from e37.co.us.ibm.com (e37.co.us.ibm.com [32.97.110.158]) by mx0b-001b2d01.pphosted.com with ESMTP id 26ccrpy3pv-1 (version=TLSv1.2 cipher=AES256-SHA bits=256 verify=NOT) for ; Fri, 28 Oct 2016 22:09:04 -0400 Received: from localhost by e37.co.us.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted for from ; Fri, 28 Oct 2016 20:09:04 -0600 In-Reply-To: <14379fd1-c9bd-ad75-ca7c-0632f3e3c5d1@sandisk.com> Sender: linux-scsi-owner@vger.kernel.org List-Id: linux-scsi@vger.kernel.org To: Bart Van Assche , "Martin K. Petersen" Cc: Hannes Reinecke , Johannes Thumshirn , Sagi Grimberg , "linux-scsi@vger.kernel.org" On Wed, 2016-10-26 at 11:44 -0700, Bart Van Assche wrote: > The solution I prefer is to modify the SCSI scanning code such that > the scan_mutex is only held while performing the actual LUN scanning > and while ensuring that no SCSI device has been created yet for a > certain LUN number but not while the Linux device and its sysfs > attributes are created. Since that approach would require extensive > changes in the SCSI scanning code, another approach has been chosen, > namely to make self-removal asynchronous. This patch avoids that > self-removal triggers the following deadlock: > > ====================================================== > [ INFO: possible circular locking dependency detected ] > 4.9.0-rc1-dbg+ #4 Not tainted > ------------------------------------------------------- > test_02_sdev_de/12586 is trying to acquire lock: > (&shost->scan_mutex){+.+.+.}, at: [] > scsi_remove_device+0x1e/0x40 > but task is already holding lock: > (s_active#336){++++.+}, at: [] > kernfs_remove_self+0xde/0x140 > which lock already depends on the new lock. > > the existing dependency chain (in reverse order) is: > -> #1 (s_active#336){++++.+}: > [] lock_acquire+0xe9/0x1d0 > [] __kernfs_remove+0x24a/0x310 > [] kernfs_remove_by_name_ns+0x40/0x90 > [] remove_files.isra.1+0x30/0x70 > [] sysfs_remove_group+0x3f/0x90 > [] sysfs_remove_groups+0x29/0x40 > [] device_remove_attrs+0x59/0x80 > [] device_del+0x125/0x240 > [] __scsi_remove_device+0x143/0x180 > [] scsi_forget_host+0x64/0x70 > [] scsi_remove_host+0x75/0x120 > [] 0xffffffffa035dbbb > [] process_one_work+0x1f5/0x690 > [] worker_thread+0x49/0x490 > [] kthread+0xeb/0x110 > [] ret_from_fork+0x27/0x40 > > -> #0 (&shost->scan_mutex){+.+.+.}: > [] __lock_acquire+0x10fc/0x1270 > [] lock_acquire+0xe9/0x1d0 > [] mutex_lock_nested+0x5f/0x360 > [] scsi_remove_device+0x1e/0x40 > [] sdev_store_delete+0x22/0x30 > [] dev_attr_store+0x13/0x20 > [] sysfs_kf_write+0x40/0x50 > [] kernfs_fop_write+0x137/0x1c0 > [] __vfs_write+0x23/0x140 > [] vfs_write+0xb0/0x190 > [] SyS_write+0x44/0xa0 > [] entry_SYSCALL_64_fastpath+0x18/0xad > > other info that might help us debug this: > > Possible unsafe locking scenario: > CPU0 CPU1 > ---- ---- > lock(s_active#336); > lock(&shost->scan_mutex); > lock(s_active#336); > lock(&shost->scan_mutex); > > *** DEADLOCK *** This is a deadlock caused by an inversion issue in kernfs (suicide vs non-suicide removes); so fixing it in SCSI alone really isn't appropriate. I count at least five other subsystems all using this mechanism, so they'll all be similarly affected. It looks to be fairly simply fixable inside kernfs, so please fix it that way. Thanks, James