From: Bart Van Assche
Date: Fri, 19 Sep 2025 13:57:18 -0700
Subject: Re: [PATCH v1 01/10] ufs: host: mediatek: Fix runtime suspend error deadlock
To: Peter Wang (王信友), "linux-scsi@vger.kernel.org",
 "martin.petersen@oracle.com"
Cc: =?UTF-8?B?QWxpY2UgQ2hhbyAo6LaZ54+u5Z2HKQ==?=,
 =?UTF-8?B?Q0MgQ2hvdSAo5ZGo5b+X5p2wKQ==?=,
 =?UTF-8?B?RWRkaWUgSHVhbmcgKOm7g+aZuuWCkSk=?=,
 =?UTF-8?B?RWQgVHNhaSAo6JSh5a6X6LuSKQ==?=,
 wsd_upstream,
 =?UTF-8?B?Q2hhb3RpYW4gSmluZyAo5LqV5pyd5aSpKQ==?=,
 =?UTF-8?B?Q2h1bi1IdW5nIFd1ICjlt6vpp7/lro8p?=,
 =?UTF-8?B?WWktZmFuIFBlbmcgKOW9ree+v+WHoSk=?=,
 =?UTF-8?B?UWlsaW4gVGFuICjosK3pupLpup8p?=,
 "linux-mediatek@lists.infradead.org",
 =?UTF-8?B?SmlhamllIEhhbyAo6YOd5Yqg6IqCKQ==?=,
 =?UTF-8?B?TGluIEd1aSAo5qGC5p6XKQ==?=,
 =?UTF-8?B?TmFvbWkgQ2h1ICjmnLHoqaDnlLAp?=,
 =?UTF-8?B?VHVuLXl1IFl1ICjmuLjmlabogb8p?=
References: <20250918104000.208856-1-peter.wang@mediatek.com>
 <20250918104000.208856-2-peter.wang@mediatek.com>
 <80a31144-852f-4df5-802e-a8c5d04a298a@acm.org>

On 9/19/25 1:11 AM, Peter Wang (王信友) wrote:
> An error occurred during the suspend process, causing IO to hang.
> This is because the error handler (eh) work is waiting for
> resume, while the suspend work is waiting for the error handler
> to finish before sending SSU.

If the suspend callback waits for error handling to finish and the error
handler waits until resuming has finished, isn't this an issue that can
occur for any UFS host controller, and hence one that should be fixed in
the UFSHCI driver core rather than in one host driver only?

Why is the hba->pm_op_in_progress variable not sufficient to prevent
this deadlock? Should this code perhaps be moved from
ufshcd_eh_host_reset_handler() into ufshcd_err_handler()?

    /*
     * If runtime PM sent SSU and got a timeout, scsi_error_handler is
     * stuck in this function waiting for flush_work(&hba->eh_work). And
     * ufshcd_err_handler(eh_work) is stuck waiting for runtime PM. Do
     * ufshcd_link_recovery instead of eh_work to prevent deadlock.
     */
    if (hba->pm_op_in_progress) {
        if (ufshcd_link_recovery(hba))
            err = FAILED;

        return err;
    }

>> How can ufs_mtk_suspend() be called while the error handler is in
>> progress? ufshcd_err_handler() disables RPM before it sets the
>> UFSHCD_EH_IN_PROGRESS flag.
>
> This error is triggered by ufs_mtk_auto_hibern8_disable,
> As the comment description
> /* May trigger EH work without exiting hibern8 error */
> so it could happen during the suspend period.

That source code comment is confusing me, especially the "without
exiting hibern8 error" part. Do you really want to say that the device
is in a hibernation error state and remains in a hibernation error
state?

>> The UFSHCD_EH_IN_PROGRESS definition and also the
>> ufshcd_set_eh_in_progress() and ufshcd_clear_eh_in_progress()
>> definitions must remain in the UFS core private code. Please do not
>> move these definitions into the include/ufs/ufshcd.h header file.
>
> Do you think we should check ufshcd_eh_in_progress in
> __ufshcd_wl_suspend? I'm not sure, because we don't see this
> error on all UFS hosts -- the vendor suspend operations
> (ufshcd_vops_suspend) could be different.

Why is auto-hibernation disabled during suspend? As far as I know the
UFSHCI standard allows auto-hibernation to remain enabled during
suspend.

Thanks,

Bart.
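
For reference, the control flow of the pm_op_in_progress check quoted above can be modeled in user space. This is only a minimal sketch under stated assumptions, not kernel code: struct model_hba, model_link_recovery(), model_eh_host_reset_handler() and the MODEL_* return codes are hypothetical stand-ins, and the "normal path" merely records that the handler would have waited for eh_work, as the quoted comment describes.

```c
#include <stdbool.h>

/* Stand-ins for the SCSI EH return codes; the values here are arbitrary. */
enum model_eh_result { MODEL_SUCCESS, MODEL_FAILED };

/* Hypothetical, heavily reduced stand-in for struct ufs_hba. */
struct model_hba {
    bool pm_op_in_progress;   /* a PM callback is currently running */
    bool link_recovery_fails; /* lets a caller inject a recovery failure */
    bool eh_work_flushed;     /* set when the handler waited for eh_work */
};

/* Stand-in for ufshcd_link_recovery(): returns nonzero on failure. */
static int model_link_recovery(struct model_hba *hba)
{
    return hba->link_recovery_fails ? -1 : 0;
}

static enum model_eh_result model_eh_host_reset_handler(struct model_hba *hba)
{
    enum model_eh_result err = MODEL_SUCCESS;

    /*
     * While a PM callback is running, waiting for eh_work would wait for
     * a handler that is itself waiting for PM, so recover the link
     * directly instead of waiting.
     */
    if (hba->pm_op_in_progress) {
        if (model_link_recovery(hba))
            err = MODEL_FAILED;
        return err;
    }

    /* Normal path: modeled as "queued eh_work and waited for it". */
    hba->eh_work_flushed = true;
    return err;
}
```

With pm_op_in_progress set, the model never reaches the wait on eh_work, which is the property the quoted comment relies on to avoid the circular wait. Whether the same shortcut is safe inside ufshcd_err_handler() is exactly the open question above; the sketch does not answer it.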