From: Laurence Oberman <loberman@redhat.com>
To: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: James Bottomley <james.bottomley@hansenpartnership.com>,
"Martin K. Petersen" <martin.petersen@oracle.com>,
Hannes Reinecke <hare@suse.de>,
Christoph Hellwig <hch@infradead.org>,
linux-scsi@vger.kernel.org
Subject: Re: [PATCH 2/2] scsi_dh_alua: Fix a recently introduced deadlock
Date: Mon, 28 Mar 2016 14:21:04 -0400 (EDT)
Message-ID: <933006845.25395115.1459189264684.JavaMail.zimbra@redhat.com>
In-Reply-To: <56F9746C.40101@sandisk.com>
In similar testing last month with mlx4_ib I was able to trigger the same lockup, so this patch will help.
If I get a chance, I will add a Tested-by this week.
Reviewed-by: Laurence Oberman <loberman@redhat.com>
Laurence Oberman
Principal Software Maintenance Engineer
Red Hat Global Support Services
----- Original Message -----
From: "Bart Van Assche" <bart.vanassche@sandisk.com>
To: "James Bottomley" <james.bottomley@hansenpartnership.com>, "Martin K. Petersen" <martin.petersen@oracle.com>
Cc: "Hannes Reinecke" <hare@suse.de>, "Christoph Hellwig" <hch@infradead.org>, linux-scsi@vger.kernel.org
Sent: Monday, March 28, 2016 2:14:04 PM
Subject: [PATCH 2/2] scsi_dh_alua: Fix a recently introduced deadlock
While retesting the SRP initiator I ran the command "rmmod mlx4_ib"
while I/O was in progress. That command triggers SCSI device removal
indirectly. Prevent this action from triggering the following deadlock
by disabling interrupts while alua_bus_detach() holds pg->lock:
=================================
[ INFO: inconsistent lock state ]
4.6.0-rc0-dbg+ #2 Tainted: G O
---------------------------------
inconsistent {IN-SOFTIRQ-W} -> {SOFTIRQ-ON-W} usage.
multipathd/484 [HC0[0]:SC0[0]:HE1:SE1] takes:
(&(&pg->lock)->rlock){+.?...}, at: [<ffffffffa04f50a2>] alua_bus_detach+0x52/0xa0 [scsi_dh_alua]
{IN-SOFTIRQ-W} state was registered at:
[<ffffffff810a64a9>] __lock_acquire+0x7e9/0x1ad0
[<ffffffff810a7fd0>] lock_acquire+0x60/0x80
[<ffffffff8159910e>] _raw_spin_lock_irqsave+0x3e/0x60
[<ffffffffa04f5131>] alua_rtpg_queue+0x41/0x1d0 [scsi_dh_alua]
[<ffffffffa04f5531>] alua_check+0xe1/0x220 [scsi_dh_alua]
[<ffffffffa04f5709>] alua_check_sense+0x99/0xb0 [scsi_dh_alua]
[<ffffffff813f0d01>] scsi_check_sense+0x71/0x3f0
[<ffffffff813f2f8b>] scsi_decide_disposition+0x18b/0x1d0
[<ffffffff813f6e52>] scsi_softirq_done+0x52/0x140
[<ffffffff812a26f2>] blk_done_softirq+0x52/0x90
[<ffffffff8105bc1f>] __do_softirq+0x10f/0x230
[<ffffffff8105bec8>] irq_exit+0xa8/0xb0
[<ffffffff8101a675>] do_IRQ+0x65/0x110
[<ffffffff8159a2c9>] ret_from_intr+0x0/0x19
[<ffffffff811732f1>] kmem_cache_alloc+0x151/0x190
[<ffffffff8118e534>] create_object+0x34/0x2d0
[<ffffffff8158eaa6>] kmemleak_alloc_percpu+0x56/0xd0
[<ffffffff8113ab0d>] pcpu_alloc+0x38d/0x660
[<ffffffff8113aded>] __alloc_percpu_gfp+0xd/0x10
[<ffffffff812e56a5>] __percpu_counter_init+0x55/0xb0
[<ffffffff812b4989>] blkg_alloc+0x79/0x230
[<ffffffff812b6756>] blkcg_init_queue+0x26/0x1d0
[<ffffffff81297eed>] blk_alloc_queue_node+0x27d/0x2e0
[<ffffffffa017766c>] dm_create+0x20c/0x570 [dm_mod]
[<ffffffffa017e356>] dev_create+0x56/0x2c0 [dm_mod]
[<ffffffffa017dcae>] ctl_ioctl+0x26e/0x520 [dm_mod]
[<ffffffffa017df6e>] dm_ctl_ioctl+0xe/0x20 [dm_mod]
[<ffffffff811aa8ee>] do_vfs_ioctl+0x8e/0x660
[<ffffffff811aaefc>] SyS_ioctl+0x3c/0x70
[<ffffffff81599929>] entry_SYSCALL_64_fastpath+0x1c/0xac
irq event stamp: 4290931
hardirqs last enabled at (4290931): [<ffffffff81599341>] _raw_spin_unlock_irqrestore+0x31/0x50
hardirqs last disabled at (4290930): [<ffffffff815990e7>] _raw_spin_lock_irqsave+0x17/0x60
softirqs last enabled at (4290774): [<ffffffff8105bcdb>] __do_softirq+0x1cb/0x230
softirqs last disabled at (4289831): [<ffffffff8105bec8>] irq_exit+0xa8/0xb0
other info that might help us debug this:
Possible unsafe locking scenario:
       CPU0
       ----
  lock(&(&pg->lock)->rlock);
  <Interrupt>
    lock(&(&pg->lock)->rlock);

 *** DEADLOCK ***
2 locks held by multipathd/484:
#0: (&bdev->bd_mutex){+.+.+.}, at: [<ffffffff811d1cc3>] __blkdev_put+0x33/0x360
#1: (sd_ref_mutex){+.+...}, at: [<ffffffff81400afc>] scsi_disk_put+0x1c/0x40
stack backtrace:
CPU: 6 PID: 484 Comm: multipathd Tainted: G O 4.6.0-rc0-dbg+ #2
Call Trace:
[<ffffffff812bd115>] dump_stack+0x67/0x92
[<ffffffff810a5175>] print_usage_bug+0x215/0x240
[<ffffffff810a56ea>] mark_lock+0x54a/0x610
[<ffffffff810a6505>] __lock_acquire+0x845/0x1ad0
[<ffffffff810a7fd0>] lock_acquire+0x60/0x80
[<ffffffff81598f23>] _raw_spin_lock+0x33/0x50
[<ffffffffa04f50a2>] alua_bus_detach+0x52/0xa0 [scsi_dh_alua]
[<ffffffff813ff6f7>] scsi_dh_release_device+0x17/0x50
[<ffffffff813fb8da>] scsi_device_dev_release_usercontext+0x2a/0x120
[<ffffffff810701f0>] execute_in_process_context+0x80/0x90
[<ffffffff813fb8a7>] scsi_device_dev_release+0x17/0x20
[<ffffffff813c8cfd>] device_release+0x2d/0x90
[<ffffffff812bfa8a>] kobject_release+0x7a/0x190
[<ffffffff812bf946>] kobject_put+0x26/0x50
[<ffffffff813c8ee2>] put_device+0x12/0x20
[<ffffffff813edc86>] scsi_device_put+0x26/0x30
[<ffffffff81400b0d>] scsi_disk_put+0x2d/0x40
[<ffffffff81400b68>] sd_release+0x48/0xb0
[<ffffffff811d1f2e>] __blkdev_put+0x29e/0x360
[<ffffffff811d24b9>] blkdev_put+0x49/0x170
[<ffffffff811d2600>] blkdev_close+0x20/0x30
[<ffffffff81198f48>] __fput+0xe8/0x1f0
[<ffffffff81199089>] ____fput+0x9/0x10
[<ffffffff81075d9e>] task_work_run+0x6e/0xa0
[<ffffffff81001119>] exit_to_usermode_loop+0xa9/0xb0
[<ffffffff81001590>] syscall_return_slowpath+0xb0/0xc0
[<ffffffff815999b7>] entry_SYSCALL_64_fastpath+0xaa/0xac
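
For readers decoding the report above, here is a minimal userspace sketch of
the rule it reflects. It is not kernel code and not how lockdep is implemented.
The model_* names are hypothetical, and the model assumes that a lock taken
with interrupts disabled cannot be interrupted by a softirq on the same CPU.
A lock class becomes inconsistent once it has been taken both from softirq
context and from a context in which a softirq could still interrupt the holder.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Simplified, hypothetical model of two lockdep usage bits. The names
 * below are illustrative and are not the kernel's own identifiers.
 */
enum model_usage {
	MODEL_IN_SOFTIRQ = 1 << 0,	/* {IN-SOFTIRQ-W}: taken from softirq context */
	MODEL_SOFTIRQ_ON = 1 << 1,	/* {SOFTIRQ-ON-W}: taken while a softirq could interrupt */
};

struct model_lock_class {
	int usage;	/* accumulated over every acquisition of the class */
};

/*
 * Record one acquisition. Returns true while the accumulated usage is
 * consistent, false once both bits are set for the same lock class.
 */
static bool model_acquire(struct model_lock_class *lc, bool in_softirq,
			  bool irqs_disabled)
{
	if (in_softirq)
		lc->usage |= MODEL_IN_SOFTIRQ;
	else if (!irqs_disabled)
		lc->usage |= MODEL_SOFTIRQ_ON;

	return lc->usage != (MODEL_IN_SOFTIRQ | MODEL_SOFTIRQ_ON);
}

/* Replay the two acquisitions from the report, without and with the fix. */
static void model_replay(void)
{
	struct model_lock_class before = { 0 }, after = { 0 };

	/* alua_rtpg_queue() reached from softirq, using spin_lock_irqsave() */
	assert(model_acquire(&before, true, true));
	/* alua_bus_detach() with plain spin_lock() in process context */
	assert(!model_acquire(&before, false, false));

	/* Same sequence, but alua_bus_detach() uses spin_lock_irq() */
	assert(model_acquire(&after, true, true));
	assert(model_acquire(&after, false, true));
}
```

Calling model_replay() passes all four assertions: the plain spin_lock()
sequence is flagged as inconsistent, while the spin_lock_irq() sequence is not.
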
Fixes: cb0a168cb6b8 (scsi_dh_alua: update 'access_state' field)
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: Hannes Reinecke <hare@suse.de>
---
drivers/scsi/device_handler/scsi_dh_alua.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/scsi/device_handler/scsi_dh_alua.c b/drivers/scsi/device_handler/scsi_dh_alua.c
index a404a41..8eaed05 100644
--- a/drivers/scsi/device_handler/scsi_dh_alua.c
+++ b/drivers/scsi/device_handler/scsi_dh_alua.c
@@ -1112,9 +1112,9 @@ static void alua_bus_detach(struct scsi_device *sdev)
 	h->sdev = NULL;
 	spin_unlock(&h->pg_lock);
 	if (pg) {
-		spin_lock(&pg->lock);
+		spin_lock_irq(&pg->lock);
 		list_del_rcu(&h->node);
-		spin_unlock(&pg->lock);
+		spin_unlock_irq(&pg->lock);
 		kref_put(&pg->kref, release_port_group);
 	}
 	sdev->handler_data = NULL;
--
2.7.3