From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-10.1 required=3.0 tests=DKIMWL_WL_HIGH,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,INCLUDES_PATCH,MAILING_LIST_MULTI,SIGNED_OFF_BY, SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id BC18EC38A29 for ; Sat, 18 Apr 2020 14:11:20 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 9574B21D6C for ; Sat, 18 Apr 2020 14:11:20 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1587219080; bh=YPVn8XiBpbhF4Hy4OYtYupNTHjRZLI+/FXRVxDRS6NA=; h=From:To:Cc:Subject:Date:In-Reply-To:References:List-ID:From; b=oZ2Zj4UA30JvmOV54W6/myLBaJY7BVych8DiPkSFDT0FTIFv9HYkPLvUgTcT6lGkJ ZtXbr1s2kq1MIUFEo8fiBsZgsVAY9cR2meaApWEmWObapoPOVlSP6WOHF9oXDg/ja8 1Ma1zTIc36E1gRLXuW7G66wrKJiN8MhIbrMAIDFU= Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727868AbgDROKd (ORCPT ); Sat, 18 Apr 2020 10:10:33 -0400 Received: from mail.kernel.org ([198.145.29.99]:38734 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727847AbgDROKb (ORCPT ); Sat, 18 Apr 2020 10:10:31 -0400 Received: from sasha-vm.mshome.net (c-73-47-72-35.hsd1.nh.comcast.net [73.47.72.35]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPSA id 2E28822250; Sat, 18 Apr 2020 14:10:30 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1587219030; bh=YPVn8XiBpbhF4Hy4OYtYupNTHjRZLI+/FXRVxDRS6NA=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=E3fM3YtIs/C5Wv/1yLHb8717dpYebD3CNNUHzgEKDI0i2a4YlwQjJTHjoUQUORm73 xZAejkVXOOoXRPiOTcyq4V3quOJw1CpygKgAdxTtwdvw9zUn9wxn78kZz80H7Y4PmL NuKkhfSsfZ8cnXDXrXbnH0TmXTFPBbp7KNcqIFQg= From: Sasha Levin To: linux-kernel@vger.kernel.org, stable@vger.kernel.org Cc: Sreekanth Reddy , "Martin K . Petersen" , Sasha Levin , MPT-FusionLinux.pdl@broadcom.com, linux-scsi@vger.kernel.org Subject: [PATCH AUTOSEL 5.5 65/75] scsi: mpt3sas: Fix kernel panic observed on soft HBA unplug Date: Sat, 18 Apr 2020 10:09:00 -0400 Message-Id: <20200418140910.8280-65-sashal@kernel.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200418140910.8280-1-sashal@kernel.org> References: <20200418140910.8280-1-sashal@kernel.org> MIME-Version: 1.0 X-stable: review X-Patchwork-Hint: Ignore Content-Transfer-Encoding: 8bit Sender: stable-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: stable@vger.kernel.org From: Sreekanth Reddy [ Upstream commit cc41f11a21a51d6869d71e525a7264c748d7c0d7 ] Generic protection fault type kernel panic is observed when user performs soft (ordered) HBA unplug operation while IOs are running on drives connected to HBA. When user performs ordered HBA removal operation, the kernel calls PCI device's .remove() call back function where driver is flushing out all the outstanding SCSI IO commands with DID_NO_CONNECT host byte and also unmaps sg buffers allocated for these IO commands. However, in the ordered HBA removal case (unlike of real HBA hot removal), HBA device is still alive and hence HBA hardware is performing the DMA operations to those buffers on the system memory which are already unmapped while flushing out the outstanding SCSI IO commands and this leads to kernel panic. Don't flush out the outstanding IOs from .remove() path in case of ordered removal since HBA will be still alive in this case and it can complete the outstanding IOs. Flush out the outstanding IOs only in case of 'physical HBA hot unplug' where there won't be any communication with the HBA. During shutdown also it is possible that HBA hardware can perform DMA operations on those outstanding IO buffers which are completed with DID_NO_CONNECT by the driver from .shutdown(). So same above fix is applied in shutdown path as well. It is safe to drop the outstanding commands when HBA is inaccessible such as when permanent PCI failure happens, when HBA is in non-operational state, or when someone does a real HBA hot unplug operation. Since driver knows that HBA is inaccessible during these cases, it is safe to drop the outstanding commands instead of waiting for SCSI error recovery to kick in and clear these outstanding commands. Link: https://lore.kernel.org/r/1585302763-23007-1-git-send-email-sreekanth.reddy@broadcom.com Fixes: c666d3be99c0 ("scsi: mpt3sas: wait for and flush running commands on shutdown/unload") Cc: stable@vger.kernel.org #v4.14.174+ Signed-off-by: Sreekanth Reddy Signed-off-by: Martin K. Petersen Signed-off-by: Sasha Levin --- drivers/scsi/mpt3sas/mpt3sas_scsih.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c b/drivers/scsi/mpt3sas/mpt3sas_scsih.c index a038be8a0e905..c919cb7ad5248 100644 --- a/drivers/scsi/mpt3sas/mpt3sas_scsih.c +++ b/drivers/scsi/mpt3sas/mpt3sas_scsih.c @@ -9747,8 +9747,8 @@ static void scsih_remove(struct pci_dev *pdev) ioc->remove_host = 1; - mpt3sas_wait_for_commands_to_complete(ioc); - _scsih_flush_running_cmds(ioc); + if (!pci_device_is_present(pdev)) + _scsih_flush_running_cmds(ioc); _scsih_fw_event_cleanup_queue(ioc); @@ -9831,8 +9831,8 @@ scsih_shutdown(struct pci_dev *pdev) ioc->remove_host = 1; - mpt3sas_wait_for_commands_to_complete(ioc); - _scsih_flush_running_cmds(ioc); + if (!pci_device_is_present(pdev)) + _scsih_flush_running_cmds(ioc); _scsih_fw_event_cleanup_queue(ioc); -- 2.20.1