public inbox for linux-block@vger.kernel.org
From: YangYang <yang.yang@vivo.com>
To: "Rafael J. Wysocki" <rafael@kernel.org>,
	Bart Van Assche <bvanassche@acm.org>
Cc: Jens Axboe <axboe@kernel.dk>, Pavel Machek <pavel@kernel.org>,
	Len Brown <lenb@kernel.org>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	Danilo Krummrich <dakr@kernel.org>,
	linux-block@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-pm@vger.kernel.org
Subject: Re: [PATCH 1/2] PM: runtime: Fix I/O hang due to race between resume and runtime disable
Date: Mon, 1 Dec 2025 17:46:45 +0800
Message-ID: <82bcdf73-54c5-4220-86c0-540a5cb59bb7@vivo.com>
In-Reply-To: <CAJZ5v0hKe+2orwKP352dBe_PB1pZqMehMo8tSDv5G+cdaJ=OsQ@mail.gmail.com>

On 2025/11/27 20:34, Rafael J. Wysocki wrote:
> On Wed, Nov 26, 2025 at 11:47 PM Bart Van Assche <bvanassche@acm.org> wrote:
>>
>> On 11/26/25 1:30 PM, Rafael J. Wysocki wrote:
>>> On Wed, Nov 26, 2025 at 10:11 PM Bart Van Assche <bvanassche@acm.org> wrote:
>>>>
>>>> On 11/26/25 12:17 PM, Rafael J. Wysocki wrote:
>>>>> --- a/block/blk-core.c
>>>>> +++ b/block/blk-core.c
>>>>> @@ -309,6 +309,8 @@ int blk_queue_enter(struct request_queue
>>>>>                 if (flags & BLK_MQ_REQ_NOWAIT)
>>>>>                         return -EAGAIN;
>>>>>
>>>>> +             /* if necessary, resume .dev (assume success). */
>>>>> +             blk_pm_resume_queue(pm, q);
>>>>>                 /*
>>>>>                  * read pair of barrier in blk_freeze_queue_start(), we need to
>>>>>                  * order reading __PERCPU_REF_DEAD flag of .q_usage_counter and
>>>>
>>>> blk_queue_enter() may be called from the suspend path so I don't think
>>>> that the above change will work.
>>>
>>> Why would the existing code work then?
>>
>> The existing code works reliably on a very large number of devices.
> 
> Well, except that it doesn't work during system suspend and
> hibernation when the PM workqueue is frozen.  I think that we agree
> here.
> 
> This needs to be addressed because it may very well cause system
> suspend to deadlock.
> 
> There are two possible ways to address it I can think of:
> 
> 1. Changing blk_pm_resume_queue() and its users to carry out a
> synchronous resume of q->dev instead of calling pm_request_resume()
> and (effectively) waiting for the queued-up runtime resume of q->dev
> to take effect.
> 
> This would be my preferred option, but at this point I'm not sure if
> it's viable.
> 

After __pm_runtime_disable() is called from device_suspend_late(),
dev->power.disable_depth is nonzero, which prevents rpm_resume() from
making progress until system resume completes, regardless of whether
rpm_resume() is invoked synchronously or asynchronously.

Performing a synchronous resume of q->dev seems to have a similar
effect to removing the following code block from
__pm_runtime_barrier(), which is invoked by __pm_runtime_disable():

    if (dev->power.request_pending) {
        dev->power.request = RPM_REQ_NONE;
        spin_unlock_irq(&dev->power.lock);

        cancel_work_sync(&dev->power.work);

        spin_lock_irq(&dev->power.lock);
        dev->power.request_pending = false;
    }
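
To illustrate the point about disable_depth, here is a minimal userspace
sketch. It is not kernel code: struct model_dev and the model_*()
functions are hypothetical names invented for this example, and the logic
is a deliberate simplification that assumes a nonzero disable_depth makes
every resume attempt fail with -EACCES and that disabling runtime PM
drops a queued resume request.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-in for the runtime PM state of a device. */
struct model_dev {
	unsigned int disable_depth;	/* nonzero: runtime PM is disabled */
	bool request_pending;		/* an async resume request is queued */
	bool suspended;			/* models RPM_SUSPENDED */
};

/* Models an asynchronous resume request: it only queues the work. */
int model_request_resume(struct model_dev *d)
{
	if (d->disable_depth > 0)
		return -EACCES;
	if (!d->suspended)
		return 1;
	d->request_pending = true;
	return 0;
}

/* Models a synchronous resume: it resumes the device immediately. */
int model_resume_sync(struct model_dev *d)
{
	if (d->disable_depth > 0)
		return -EACCES;
	if (!d->suspended)
		return 1;
	d->suspended = false;
	return 0;
}

/* Models runtime disable in the late suspend phase: bump the depth and
 * drop any queued request, as the barrier code quoted above does. */
void model_runtime_disable(struct model_dev *d)
{
	d->disable_depth++;
	d->request_pending = false;
}
```

In this model, a resume request queued before model_runtime_disable() is
lost, and both model_request_resume() and model_resume_sync() return
-EACCES afterwards, so the device stays suspended either way.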

> 2. Stop freezing the PM workqueue before system suspend/hibernation
> and adapt device_suspend_late() to that.
> 
> This should be doable, even though it is a bit risky because it may
> uncover some latent bugs (the freezing of the PM workqueue has been
> there forever), but it wouldn't address the problem entirely because
> device_suspend_late() would still need to disable runtime PM for the
> device (and for some devices it is disabled earlier), so
> pm_request_resume() would just start to fail at that point and if
> blk_queue_enter() were called after that point for a device supporting
> runtime PM, it might deadlock.
> 
>> Maybe there is a misunderstanding? RQF_PM / BLK_MQ_REQ_PM are set for
>> requests that should be processed even if the power status is changing
>> (RPM_SUSPENDING or RPM_RESUMING). The meaning of the 'pm' variable is
>> as follows: process this request even if a power state change is
>> ongoing.
> 
> I see.
> 
> The behavior depends on whether or not q->pm_only is set.  If it is
> not set, both blk_queue_enter() and __bio_queue_enter() will allow the
> request to be processed.
> 
> If q->pm_only is set, __bio_queue_enter() will wait until it gets
> cleared and in that case pm_request_resume(q->dev) is called to make
> that happen (did I get it right?).  This is a bit fragile because what
> if the async resume of q->dev fails for some reason?  You deadlock
> instead of failing the request.
> 
> Unlike __bio_queue_enter(), blk_queue_enter() additionally checks the
> runtime PM status of the queue if q->pm_only is set and it will allow
> the request to be processed in that case so long as q->rpm_status is
> not RPM_SUSPENDED.  However, if the queue status is RPM_SUSPENDED,
> pm_request_resume(q->dev) will be called like in the
> __bio_queue_enter() case.
> 
> I'm not sure why pm_request_resume(q->dev) needs to be called from
> within blk_pm_resume_queue().  Arguably, it should be sufficient to
> call it once before using the wait_event() macro, if the conditions
> checked by blk_pm_resume_queue() are not met.
> 
>>> Are you suggesting that q->rpm_status should still be checked before
>>> calling pm_runtime_resume() or do you mean something else?
>> The purpose of the code changes from a previous email is not entirely
>> clear to me so I'm not sure what the code should look like. But to
>> answer your question, calling blk_pm_resume_queue() if the runtime
>> status is RPM_SUSPENDED should be safe.
>>>> As an example, the UFS driver submits a
>>>> SCSI START STOP UNIT command from its runtime suspend callback. The call
>>>> chain is as follows:
>>>>
>>>>      ufshcd_wl_runtime_suspend()
>>>>        __ufshcd_wl_suspend()
>>>>          ufshcd_set_dev_pwr_mode()
>>>>            ufshcd_execute_start_stop()
>>>>              scsi_execute_cmd()
>>>>                scsi_alloc_request()
>>>>                  blk_queue_enter()
>>>>                blk_execute_rq()
>>>>                blk_mq_free_request()
>>>>                  blk_queue_exit()
>>>
>>> In any case, calling pm_request_resume() from blk_pm_resume_queue() in
>>> the !pm case is a mistake.
>>    Hmm ... we may disagree about this. Does what I wrote above make clear
>> why blk_pm_resume_queue() is called if pm == false?
> 
> Yes, it does, thanks!
