From: Greg Kroah-Hartman
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman, stable@vger.kernel.org, Xiang Chen, John Garry, "Martin K. Petersen", Sasha Levin
Subject: [PATCH 4.19 019/187] scsi: hisi_sas: Fix a timeout race of driver internal and SMP IO
Date: Thu, 4 Apr 2019 10:45:56 +0200
Message-Id: <20190404084604.001760444@linuxfoundation.org>
In-Reply-To: <20190404084603.119654039@linuxfoundation.org>
References: <20190404084603.119654039@linuxfoundation.org>

4.19-stable review patch.  If anyone has any objections, please let me know.

------------------

[ Upstream commit 4790595723d4b833b18c994973d39f9efb842887 ]

Internal IO and SMP IO are each guarded by a time-out timer. The timer
handler checks whether the IO is done according to the flags in
task->task_state_flags.

There is an issue which may cause the system to hang: an internal IO or
SMP IO is sent, but because of a hardware exception (such as an injected
2-bit ECC error) the IO is neither completed nor timed out yet. At that
point a SAS controller reset occurs to recover the system. The reset
releases the resources and sets the status of the IO to
SAS_TASK_STATE_DONE, so when the IO later times out, the handler never
completes the completion of the IO and the waiter waits forever.
[ 729.123632] Call trace:
[ 729.126791] __switch_to+0x94/0xa8
[ 729.133106] __schedule+0x1e8/0x7fc
[ 729.138975] schedule+0x34/0x8c
[ 729.144401] schedule_timeout+0x1d8/0x3cc
[ 729.150690] wait_for_common+0xdc/0x1a0
[ 729.157101] wait_for_completion+0x28/0x34
[ 729.165973] hisi_sas_internal_task_abort+0x2a0/0x424 [hisi_sas_test_main]
[ 729.176447] hisi_sas_abort_task+0x244/0x2d8 [hisi_sas_test_main]
[ 729.185258] sas_eh_handle_sas_errors+0x1c8/0x7b8
[ 729.192391] sas_scsi_recover_host+0x130/0x398
[ 729.199237] scsi_error_handler+0x148/0x5c0
[ 729.206009] kthread+0x10c/0x138
[ 729.211563] ret_from_fork+0x10/0x18

To solve the issue, the task_done callback of those IOs needs to be
called on SAS controller reset.

Signed-off-by: Xiang Chen
Signed-off-by: John Garry
Signed-off-by: Martin K. Petersen
Signed-off-by: Sasha Levin
---
 drivers/scsi/hisi_sas/hisi_sas_main.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/scsi/hisi_sas/hisi_sas_main.c b/drivers/scsi/hisi_sas/hisi_sas_main.c
index c25f3a9b0b9f..fd9d82c9033d 100644
--- a/drivers/scsi/hisi_sas/hisi_sas_main.c
+++ b/drivers/scsi/hisi_sas/hisi_sas_main.c
@@ -810,7 +810,8 @@ static void hisi_sas_do_release_task(struct hisi_hba *hisi_hba, struct sas_task
 		spin_lock_irqsave(&task->task_state_lock, flags);
 		task->task_state_flags &=
 			~(SAS_TASK_STATE_PENDING | SAS_TASK_AT_INITIATOR);
-		task->task_state_flags |= SAS_TASK_STATE_DONE;
+		if (!slot->is_internal && task->task_proto != SAS_PROTOCOL_SMP)
+			task->task_state_flags |= SAS_TASK_STATE_DONE;
 		spin_unlock_irqrestore(&task->task_state_lock, flags);
 	}
 
-- 
2.19.1
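
The race described in the changelog can be illustrated with a small
userspace model. This is only a sketch and not driver code: the
model_* names, the flag values, and the shape of the timeout handler
(which is assumed to signal the waiter only when DONE is not already
set, as the changelog implies) are all hypothetical stand-ins. The
"patched" switch selects between the old and new behaviour of the
release path shown in the diff.

```c
#include <stdbool.h>

/* Stand-in flag values; the real ones are defined by the kernel's libsas. */
#define MODEL_STATE_PENDING (1u << 0)
#define MODEL_STATE_DONE    (1u << 1)
#define MODEL_STATE_ABORTED (1u << 2)

/* Hypothetical, simplified view of a task; not the kernel's struct sas_task. */
struct model_task {
	unsigned int state_flags;
	bool is_internal;	/* models slot->is_internal */
	bool is_smp;		/* models task->task_proto == SAS_PROTOCOL_SMP */
	bool completed;		/* models the completion the waiter sleeps on */
};

/* Release path run on controller reset. */
static void model_release_task(struct model_task *t, bool patched)
{
	t->state_flags &= ~MODEL_STATE_PENDING;
	/* Old code always set DONE; new code skips internal and SMP tasks. */
	if (!patched || (!t->is_internal && !t->is_smp))
		t->state_flags |= MODEL_STATE_DONE;
}

/* Assumed timeout handler: wakes the waiter only if the task is not DONE. */
static void model_timeout(struct model_task *t)
{
	if (!(t->state_flags & MODEL_STATE_DONE)) {
		t->state_flags |= MODEL_STATE_ABORTED;
		t->completed = true;
	}
}

/* Returns true if the waiter is woken after a reset followed by a timeout. */
static bool waiter_woken(bool is_internal, bool is_smp, bool patched)
{
	struct model_task t = { MODEL_STATE_PENDING, is_internal, is_smp, false };

	model_release_task(&t, patched);
	model_timeout(&t);
	return t.completed;
}
```

Under these assumptions, waiter_woken(true, false, false) models the
hang in the call trace above (the waiter is never woken), while the
same call with patched set to true lets the timeout path wake the
waiter.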