From mboxrd@z Thu Jan 1 00:00:00 1970
From: Tony Battersby
Subject: [PATCH] scsi-mq: fix potential deadlock in scsi_internal_device_unblock()
Date: Thu, 05 Feb 2015 14:50:27 -0500
Message-ID: <54D3C983.4010806@cybernetics.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
Return-path:
Received: from mail.cybernetics.com ([173.71.130.66]:58653 "EHLO mail.cybernetics.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752089AbbBETub (ORCPT ); Thu, 5 Feb 2015 14:50:31 -0500
Sender: linux-scsi-owner@vger.kernel.org
List-Id: linux-scsi@vger.kernel.org
To: linux-scsi@vger.kernel.org, Jens Axboe, Christoph Hellwig, "James E.J. Bottomley"

A process context may acquire struct blk_mq_hw_ctx::lock without
disabling IRQs.  A deadlock may result if the process context holding
the spinlock is interrupted by an IRQ that calls
scsi_internal_device_unblock(), which may also try to acquire the same
spinlock.  Pass 'async = true' to blk_mq_start_stopped_hw_queues() to
prevent the deadlock.

This fixes a lockdep warning triggered by unplugging a SAS cable using
mpt2sas:

=================================
[ INFO: inconsistent lock state ]
3.19.0-rc7 #2 Not tainted
---------------------------------
inconsistent {HARDIRQ-ON-W} -> {IN-HARDIRQ-W} usage.
swapper/2/0 [HC1[1]:SC0[0]:HE0:SE1] takes:
 (&(&hctx->lock)->rlock){?.+...}, at: [] __blk_mq_run_hw_queue+0x183/0x3f0
{HARDIRQ-ON-W} state was registered at:
  [] __lock_acquire+0x721/0xc10
  [] lock_acquire+0x5a/0x70
  [] _raw_spin_lock+0x33/0x50
  [] __blk_mq_run_hw_queue+0x24f/0x3f0
  [] blk_mq_run_hw_queue+0x88/0xc0
  [] blk_sq_make_request+0x15f/0x240
  [] generic_make_request+0xc0/0x100
  [] submit_bio+0x58/0x100
  [] _submit_bh+0x117/0x150
  [] submit_bh+0xb/0x10
  [] block_read_full_page+0x268/0x370
  [] blkdev_readpage+0x13/0x20
  [] generic_file_read_iter+0x20b/0x640
  [] blkdev_read_iter+0x32/0x40
  [] new_sync_read+0x8a/0xc0
  [] __vfs_read+0x13/0x60
  [] vfs_read+0xa8/0x110
  [] SyS_read+0x54/0xc0
  [] system_call_fastpath+0x12/0x17
irq event stamp: 396654
hardirqs last enabled at (396651): [] cpuidle_enter_state+0x51/0xd0
hardirqs last disabled at (396652): [] common_interrupt+0x67/0x6c
softirqs last enabled at (396654): [] _local_bh_enable+0x1c/0x50
softirqs last disabled at (396653): [] irq_enter+0x50/0x80

other info that might help us debug this:
 Possible unsafe locking scenario:

       CPU0
       ----
  lock(&(&hctx->lock)->rlock);
  <Interrupt>
    lock(&(&hctx->lock)->rlock);

 *** DEADLOCK ***

no locks held by swapper/2/0.

stack backtrace:
CPU: 2 PID: 0 Comm: swapper/2 Not tainted 3.19.0-rc7 #2
Hardware name: Supermicro X8DTH-i/6/iF/6F/X8DTH, BIOS 2.1b 05/04/12
 ffffffff80fbf8b0 ffff88033e4439c8 ffffffff8067b708 0000000000000001
 ffff88032fe9a1c0 ffff88033e443a18 ffffffff8029d40f 0000000000000000
 ffffffff00000000 ffff880300000001 ffffffff80fbf8e8 ffff88032fe9a858
Call Trace:
 [] dump_stack+0x4f/0x6f
 [] print_usage_bug+0x23f/0x300
 [] mark_lock+0x61d/0x690
 [] __lock_acquire+0x76d/0xc10
 [] ? enqueue_task_fair+0x1fc/0x890
 [] ? resched_curr+0x89/0xc0
 [] lock_acquire+0x5a/0x70
 [] ? __blk_mq_run_hw_queue+0x183/0x3f0
 [] _raw_spin_lock+0x33/0x50
 [] ? __blk_mq_run_hw_queue+0x183/0x3f0
 [] __blk_mq_run_hw_queue+0x183/0x3f0
 [] blk_mq_run_hw_queue+0x88/0xc0
 [] blk_mq_start_stopped_hw_queues+0x60/0x80
 [] scsi_internal_device_unblock+0x46/0xb0
 [] _scsih_ublock_io_device+0x7f/0xd0 [mpt2sas]
 [] _scsih_tm_tr_send+0x192/0x320 [mpt2sas]
 [] mpt2sas_scsih_event_callback+0x3b3/0x7b0 [mpt2sas]
 [] _base_interrupt+0x340/0x9d0 [mpt2sas]
 [] ? __lock_acquire+0x50c/0xc10
 [] handle_irq_event_percpu+0x43/0x120
 [] handle_irq_event+0x43/0x70
 [] handle_edge_irq+0x9d/0x100
 [] handle_irq+0x54/0x130
 [] ? atomic_notifier_call_chain+0x11/0x20
 [] do_IRQ+0x57/0x100
 [] common_interrupt+0x6c/0x6c
 [] ? cpuidle_enter_state+0x5c/0xd0
 [] ? cpuidle_enter_state+0x51/0xd0
 [] cpuidle_enter+0x12/0x20
 [] cpu_startup_entry+0x25f/0x300
 [] start_secondary+0x13f/0x170

Cc: # 3.17+
Signed-off-by: Tony Battersby
---

Note that this patch does *not* fix the deadlock with mptsas that I
reported yesterday; that is a completely different issue that still
needs to be addressed.

--- linux-3.19-rc7/drivers/scsi/scsi_lib.c.orig	2015-02-01 23:07:21.000000000 -0500
+++ linux-3.19-rc7/drivers/scsi/scsi_lib.c	2015-02-05 13:28:12.000000000 -0500
@@ -3005,7 +3005,7 @@ scsi_internal_device_unblock(struct scsi
 		return -EINVAL;
 
 	if (q->mq_ops) {
-		blk_mq_start_stopped_hw_queues(q, false);
+		blk_mq_start_stopped_hw_queues(q, true);
 	} else {
 		spin_lock_irqsave(q->queue_lock, flags);
 		blk_start_queue(q);
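
For anyone who wants the scenario in miniature, here is a toy
userspace model of the lock/interrupt interleaving described in the
commit message.  It is only a sketch: every name in it (toy_hctx,
toy_run_queue, toy_unblock_from_irq, toy_scenario) is hypothetical and
none of it is kernel code.  It assumes that 'async = true' means the
queue run is deferred to a later context instead of being executed by
the caller, and it models "would spin forever" as a false return value
rather than actually hanging.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy userspace model; none of these names are kernel APIs. */
struct toy_hctx {
	bool lock_held;   /* stands in for hctx->lock (non-recursive) */
	bool run_pending; /* stands in for a queue run deferred to a worker */
	int runs;         /* completed queue runs */
};

/*
 * Run the queue inline.  Returns false where a real spinlock would
 * spin forever because this CPU already holds the lock.
 */
static bool toy_run_queue(struct toy_hctx *h)
{
	if (h->lock_held)
		return false;
	h->lock_held = true;
	h->runs++;
	h->lock_held = false;
	return true;
}

/* Models the unblock path being called from hard-IRQ context. */
static bool toy_unblock_from_irq(struct toy_hctx *h, bool async)
{
	if (async) {
		h->run_pending = true; /* defer; no lock taken in IRQ context */
		return true;
	}
	return toy_run_queue(h);
}

/* Returns true if the scenario completes, false if it would deadlock. */
static bool toy_scenario(bool async)
{
	struct toy_hctx h = { false, false, 0 };
	bool ok;

	h.lock_held = true;                   /* process context takes the lock, IRQs on */
	ok = toy_unblock_from_irq(&h, async); /* interrupt arrives on the same CPU */
	h.lock_held = false;                  /* process context drops the lock */
	if (!ok)
		return false;

	if (h.run_pending) {                  /* deferred work runs later, lock is free */
		h.run_pending = false;
		ok = toy_run_queue(&h);
	}
	assert(h.runs == 1);
	return ok;
}
```

In this model toy_scenario(false) returns false (the inline run hits
the lock already held by the interrupted context), while
toy_scenario(true) returns true because the run happens only after the
lock has been released.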