public inbox for linux-mediatek@lists.infradead.org
From: Bart Van Assche <bvanassche@acm.org>
To: "Peter Wang (王信友)" <peter.wang@mediatek.com>,
	"linux-scsi@vger.kernel.org" <linux-scsi@vger.kernel.org>,
	"martin.petersen@oracle.com" <martin.petersen@oracle.com>
Cc: "Alice Chao (趙珮均)" <Alice.Chao@mediatek.com>,
	"CC Chou (周志杰)" <cc.chou@mediatek.com>,
	"Eddie Huang (黃智傑)" <eddie.huang@mediatek.com>,
	"Ed Tsai (蔡宗軒)" <Ed.Tsai@mediatek.com>,
	wsd_upstream <wsd_upstream@mediatek.com>,
	"Chaotian Jing (井朝天)" <Chaotian.Jing@mediatek.com>,
	"Chun-Hung Wu (巫駿宏)" <Chun-hung.Wu@mediatek.com>,
	"Yi-fan Peng (彭羿凡)" <Yi-fan.Peng@mediatek.com>,
	"Qilin Tan (谭麒麟)" <Qilin.Tan@mediatek.com>,
	"linux-mediatek@lists.infradead.org"
	<linux-mediatek@lists.infradead.org>,
	"Jiajie Hao (郝加节)" <jiajie.hao@mediatek.com>,
	"Lin Gui (桂林)" <Lin.Gui@mediatek.com>,
	"Naomi Chu (朱詠田)" <Naomi.Chu@mediatek.com>,
	"Tun-yu Yu (游敦聿)" <Tun-yu.Yu@mediatek.com>
Subject: Re: [PATCH v1 01/10] ufs: host: mediatek: Fix runtime suspend error deadlock
Date: Fri, 19 Sep 2025 13:57:18 -0700
Message-ID: <bc612c10-a4eb-41ab-b8e5-726d22935518@acm.org>
In-Reply-To: <bdb6ee1402ae4c9ba8919011b1d8fcb9d984129f.camel@mediatek.com>

On 9/19/25 1:11 AM, Peter Wang (王信友) wrote:
> An error occurred during the suspend process, causing IO to hang.
> This is because the error handler (eh) work is waiting for
> resume, while the suspend work is waiting for the error handler
> to finish before sending SSU.

If the suspend callback waits for error handling to finish and the
error handler waits until resuming has finished, isn't this an issue
that can occur with any UFS host controller, and hence one that should
be fixed in the UFSHCI driver core rather than in a single host driver?

Why is the hba->pm_op_in_progress variable not sufficient to prevent
this deadlock? Should this code perhaps be moved from
ufshcd_eh_host_reset_handler() into ufshcd_err_handler()?

	/*
	 * If runtime PM sent SSU and got a timeout, scsi_error_handler is
	 * stuck in this function waiting for flush_work(&hba->eh_work). And
	 * ufshcd_err_handler(eh_work) is stuck waiting for runtime PM. Do
	 * ufshcd_link_recovery instead of eh_work to prevent deadlock.
	 */
	if (hba->pm_op_in_progress) {
		if (ufshcd_link_recovery(hba))
			err = FAILED;

		return err;
	}
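
To make the wait-for cycle described in that comment concrete, here is
a minimal userspace sketch. It is not kernel code: struct model_hba,
enum eh_path, choose_reset_path() and would_deadlock() are hypothetical
names invented for illustration, and the model assumes the only
relevant state is whether a PM operation is in progress.

```c
#include <stdbool.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
enum eh_path {
	EH_PATH_FLUSH_EH_WORK,        /* queue eh_work and wait for it */
	EH_PATH_DIRECT_LINK_RECOVERY, /* recover the link synchronously */
};

struct model_hba {
	bool pm_op_in_progress; /* a PM callback is currently executing */
};

/* Same shape as the quoted check: avoid eh_work during a PM operation. */
static enum eh_path choose_reset_path(const struct model_hba *hba)
{
	return hba->pm_op_in_progress ? EH_PATH_DIRECT_LINK_RECOVERY
				      : EH_PATH_FLUSH_EH_WORK;
}

/*
 * The cycle modeled here: the PM operation waits for SCSI error
 * handling, SCSI error handling waits for eh_work, and eh_work waits
 * for the PM operation. It can only close if eh_work is flushed while
 * a PM operation is in progress.
 */
static bool would_deadlock(const struct model_hba *hba, enum eh_path path)
{
	return hba->pm_op_in_progress && path == EH_PATH_FLUSH_EH_WORK;
}
```

In this model, choose_reset_path() never returns a path for which
would_deadlock() is true, which is the property the quoted check is
meant to provide.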

>> How can ufs_mtk_suspend() be called while the error handler is in
>> progress? ufshcd_err_handler() disables RPM before it sets the
>> UFSHCD_EH_IN_PROGRESS flag.
> 
> This error is triggered by ufs_mtk_auto_hibern8_disable.
> As the comment there says:
> /* May trigger EH work without exiting hibern8 error */
> so it could happen during the suspend period.

That source code comment confuses me, especially the "without
exiting hibern8 error" part. Do you really mean that the device
is in a hibernation error state and remains in that state?

>> The UFSHCD_EH_IN_PROGRESS definition and also the
>> ufshcd_set_eh_in_progress() and ufshcd_clear_eh_in_progress()
>> definitions must remain in the UFS core private code. Please do not
>> move these definitions into the include/ufs/ufshcd.h header file.
> 
> Do you think we should check ufshcd_eh_in_progress in
> __ufshcd_wl_suspend? I'm not sure, because we don't see this
> error on all UFS hosts — the vendor suspend operations
> (ufshcd_vops_suspend) could be different.

Why is auto-hibernation disabled during suspend? As far as I know, the
UFSHCI standard allows auto-hibernation to remain enabled during suspend.

Thanks,

Bart.



Thread overview: 37+ messages
2025-09-18 10:36 [PATCH v1 00/10] Enhance UFS Mediatek Driver peter.wang
2025-09-18 10:36 ` [PATCH v1 01/10] ufs: host: mediatek: Fix runtime suspend error deadlock peter.wang
2025-09-18 18:27   ` Bart Van Assche
2025-09-19  8:11     ` Peter Wang (王信友)
2025-09-19 20:57       ` Bart Van Assche [this message]
2025-09-22  8:37         ` Peter Wang (王信友)
2025-09-22 18:27           ` Bart Van Assche
2025-09-23  5:56             ` Peter Wang (王信友)
2025-09-18 10:36 ` [PATCH v1 02/10] ufs: host: mediatek: Correct clock scaling with PM QoS flow peter.wang
2025-09-18 18:30   ` Bart Van Assche
2025-09-19  8:11     ` Peter Wang (王信友)
2025-09-19 21:02       ` Bart Van Assche
2025-09-22  8:39         ` Peter Wang (王信友)
2025-09-22 19:21           ` Bart Van Assche
2025-09-23  5:58             ` Peter Wang (王信友)
2025-09-18 10:36 ` [PATCH v1 03/10] ufs: host: mediatek: Adjust clock scaling for PM flow peter.wang
2025-09-18 10:36 ` [PATCH v1 04/10] ufs: host: mediatek: Handle clock scaling for high gear in " peter.wang
2025-09-18 10:36 ` [PATCH v1 05/10] ufs: host: mediatek: Adjust sync length for FASTAUTO mode peter.wang
2025-09-18 19:28   ` Bart Van Assche
2025-09-19  8:12     ` Peter Wang (王信友)
2025-09-18 10:36 ` [PATCH v1 06/10] ufs: host: mediatek: Enable interrupts for MCQ mode peter.wang
2025-09-18 18:34   ` Bart Van Assche
2025-09-19  8:14     ` Peter Wang (王信友)
2025-09-19 21:09       ` Bart Van Assche
2025-09-22  8:41         ` Peter Wang (王信友)
2025-09-22 19:26           ` Bart Van Assche
2025-09-23  5:59             ` Peter Wang (王信友)
2025-09-18 10:36 ` [PATCH v1 07/10] ufs: host: mediatek: Fix shutdown/suspend race condition peter.wang
2025-09-18 18:39   ` Bart Van Assche
2025-09-19  8:15     ` Peter Wang (王信友)
2025-09-19 21:10       ` Bart Van Assche
2025-09-18 10:36 ` [PATCH v1 08/10] ufs: host: mediatek: Remove duplicate function peter.wang
2025-09-18 19:29   ` Bart Van Assche
2025-09-18 10:36 ` [PATCH v1 09/10] ufs: host: mediatek: Add support for new platform with MMIO_OTSD_CTR peter.wang
2025-09-18 10:36 ` [PATCH v1 10/10] ufs: host: mediatek: Support new feature for MT6991 peter.wang
2025-09-18 19:32   ` Bart Van Assche
2025-09-19  8:17     ` Peter Wang (王信友)
