* [PATCH 0/2] PM: runtime: Fix potential I/O hang
@ 2025-11-26 10:16 Yang Yang
  2025-11-26 10:16 ` [PATCH 1/2] PM: runtime: Fix I/O hang due to race between resume and runtime disable Yang Yang
                   ` (2 more replies)
  0 siblings, 3 replies; 44+ messages in thread
From: Yang Yang @ 2025-11-26 10:16 UTC (permalink / raw)
To: Jens Axboe, Rafael J. Wysocki, Pavel Machek, Len Brown,
Greg Kroah-Hartman, Danilo Krummrich, linux-block, linux-kernel,
linux-pm
Cc: Yang Yang
Yang Yang (2):
PM: runtime: Fix I/O hang due to race between resume and runtime
disable
blk-mq: Fix I/O hang caused by incomplete device resume
block/blk-pm.c | 1 +
drivers/base/power/runtime.c | 3 ++-
include/linux/pm.h | 1 +
3 files changed, 4 insertions(+), 1 deletion(-)
--
2.34.1
^ permalink raw reply	[flat|nested] 44+ messages in thread

* [PATCH 1/2] PM: runtime: Fix I/O hang due to race between resume and runtime disable
  2025-11-26 10:16 [PATCH 0/2] PM: runtime: Fix potential I/O hang Yang Yang
@ 2025-11-26 10:16 ` Yang Yang
  2025-11-26 11:30   ` Rafael J. Wysocki
  2025-11-26 10:16 ` [PATCH 2/2] blk-mq: Fix I/O hang caused by incomplete device resume Yang Yang
  2025-11-26 11:31 ` [PATCH 0/2] PM: runtime: Fix potential I/O hang Rafael J. Wysocki
  2 siblings, 1 reply; 44+ messages in thread
From: Yang Yang @ 2025-11-26 10:16 UTC (permalink / raw)
To: Jens Axboe, Rafael J. Wysocki, Pavel Machek, Len Brown,
	Greg Kroah-Hartman, Danilo Krummrich, linux-block, linux-kernel,
	linux-pm
Cc: Yang Yang

We observed the following hung task during our test:

[ 3987.095999] INFO: task "kworker/u32:7":239 blocked for more than 188 seconds.
[ 3987.096017] task:kworker/u32:7 state:D stack:0 pid:239 tgid:239 ppid:2 flags:0x00000408
[ 3987.096042] Workqueue: writeback wb_workfn (flush-254:59)
[ 3987.096069] Call trace:
[ 3987.096073]  __switch_to+0x1a0/0x318
[ 3987.096089]  __schedule+0xa38/0xf9c
[ 3987.096104]  schedule+0x74/0x10c
[ 3987.096118]  __bio_queue_enter+0xb8/0x178
[ 3987.096132]  blk_mq_submit_bio+0x104/0x728
[ 3987.096145]  __submit_bio+0xa0/0x23c
[ 3987.096159]  submit_bio_noacct_nocheck+0x164/0x330
[ 3987.096173]  submit_bio_noacct+0x348/0x468
[ 3987.096186]  submit_bio+0x17c/0x198
[ 3987.096199]  f2fs_submit_write_bio+0x44/0xe8
[ 3987.096211]  __submit_merged_bio+0x40/0x11c
[ 3987.096222]  __submit_merged_write_cond+0xcc/0x1f8
[ 3987.096233]  f2fs_write_data_pages+0xbb8/0xd0c
[ 3987.096246]  do_writepages+0xe0/0x2f4
[ 3987.096255]  __writeback_single_inode+0x44/0x4ac
[ 3987.096272]  writeback_sb_inodes+0x30c/0x538
[ 3987.096289]  __writeback_inodes_wb+0x9c/0xec
[ 3987.096305]  wb_writeback+0x158/0x440
[ 3987.096321]  wb_workfn+0x388/0x5d4
[ 3987.096335]  process_scheduled_works+0x1c4/0x45c
[ 3987.096346]  worker_thread+0x32c/0x3e8
[ 3987.096356]  kthread+0x11c/0x1b0
[ 3987.096372]  ret_from_fork+0x10/0x20

T1:                                            T2:
blk_queue_enter
  blk_pm_resume_queue
    pm_request_resume
      __pm_runtime_resume(dev, RPM_ASYNC)
        rpm_resume                             __pm_runtime_disable
          dev->power.request_pending = true      dev->power.disable_depth++
          queue_work(pm_wq, &dev->power.work)    __pm_runtime_barrier
  wait_event                                       cancel_work_sync(&dev->power.work)

T1 queues the work item, which is then cancelled by T2 before it starts
execution. As a result, q->dev cannot be resumed, and T1 waits here for
a long time.

Signed-off-by: Yang Yang <yang.yang@vivo.com>
---
 drivers/base/power/runtime.c | 3 ++-
 include/linux/pm.h           | 1 +
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/base/power/runtime.c b/drivers/base/power/runtime.c
index 1b11a3cd4acc..fc9bf3fb3bb7 100644
--- a/drivers/base/power/runtime.c
+++ b/drivers/base/power/runtime.c
@@ -1533,7 +1533,8 @@ void __pm_runtime_disable(struct device *dev, bool check_resume)
 	 * means there probably is some I/O to process and disabling runtime PM
 	 * shouldn't prevent the device from processing the I/O.
 	 */
-	if (check_resume && dev->power.request_pending &&
+	if ((check_resume || dev->power.force_check_resume) &&
+	    dev->power.request_pending &&
 	    dev->power.request == RPM_REQ_RESUME) {
 		/*
 		 * Prevent suspends and idle notifications from being carried
diff --git a/include/linux/pm.h b/include/linux/pm.h
index cc7b2dc28574..4eb20569cdbc 100644
--- a/include/linux/pm.h
+++ b/include/linux/pm.h
@@ -708,6 +708,7 @@ struct dev_pm_info {
 	bool			use_autosuspend:1;
 	bool			timer_autosuspends:1;
 	bool			memalloc_noio:1;
+	bool			force_check_resume:1;
 	unsigned int		links_count;
 	enum rpm_request	request;
 	enum rpm_status		runtime_status;
--
2.34.1
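[Editor's note: the lost-wakeup pattern described above can be modeled in a few lines of userspace C. This is an illustrative sketch only — the field and function names mirror the kernel's `dev_pm_info` fields, but the "work queue" is reduced to a flag and nothing here is kernel code.]

```c
#include <assert.h>
#include <stdbool.h>

/* Minimal model of the device's runtime-PM bookkeeping. */
struct fake_dev {
	bool request_pending;	/* dev->power.request_pending */
	bool work_queued;	/* &dev->power.work sits on pm_wq */
	int disable_depth;	/* dev->power.disable_depth */
	bool resumed;
};

/* T1: pm_request_resume() queues an asynchronous resume. */
static void fake_request_resume(struct fake_dev *d)
{
	d->request_pending = true;
	d->work_queued = true;		/* queue_work(pm_wq, ...) */
}

/* T2: __pm_runtime_disable(dev, check_resume). */
static void fake_runtime_disable(struct fake_dev *d, bool check_resume)
{
	if (check_resume && d->request_pending) {
		d->resumed = true;	/* carry out the pending resume */
		d->request_pending = false;
	}
	d->disable_depth++;
	d->work_queued = false;		/* cancel_work_sync(...) */
}

/* Runs the T1/T2 interleaving; returns true if the device got resumed. */
bool race_outcome(bool check_resume)
{
	struct fake_dev d = { 0 };

	fake_request_resume(&d);		/* T1 */
	fake_runtime_disable(&d, check_resume);	/* T2 */
	/*
	 * With check_resume == false, the queued work is cancelled before
	 * it runs and nothing ever resumes the device, so T1's
	 * wait_event() in blk_queue_enter() would never finish.
	 */
	return d.resumed;
}
```

With `check_resume` false (as passed from device_suspend_late()) the pending resume is silently thrown away, which is exactly the hang window the patch targets; with it true (or with the proposed `force_check_resume` bit set) the pending request is honoured before runtime PM is disabled.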
* Re: [PATCH 1/2] PM: runtime: Fix I/O hang due to race between resume and runtime disable
  2025-11-26 10:16 ` [PATCH 1/2] PM: runtime: Fix I/O hang due to race between resume and runtime disable Yang Yang
@ 2025-11-26 11:30   ` Rafael J. Wysocki
  2025-11-26 11:59     ` YangYang
  2025-11-26 18:06     ` Bart Van Assche
  0 siblings, 2 replies; 44+ messages in thread
From: Rafael J. Wysocki @ 2025-11-26 11:30 UTC (permalink / raw)
To: Yang Yang
Cc: Jens Axboe, Rafael J. Wysocki, Pavel Machek, Len Brown,
	Greg Kroah-Hartman, Danilo Krummrich, linux-block, linux-kernel,
	linux-pm

On Wed, Nov 26, 2025 at 11:17 AM Yang Yang <yang.yang@vivo.com> wrote:
>
> We observed the following hung task during our test:
>
> [...]
>
> T1:                                            T2:
> blk_queue_enter
>   blk_pm_resume_queue
>     pm_request_resume

Shouldn't this be pm_runtime_resume() rather?

>       __pm_runtime_resume(dev, RPM_ASYNC)
>         rpm_resume                             __pm_runtime_disable
>           dev->power.request_pending = true      dev->power.disable_depth++
>           queue_work(pm_wq, &dev->power.work)    __pm_runtime_barrier
>   wait_event                                       cancel_work_sync(&dev->power.work)
>
> T1 queues the work item, which is then cancelled by T2 before it starts
> execution. As a result, q->dev cannot be resumed, and T1 waits here for
> a long time.
>
> [...]
> -	if (check_resume && dev->power.request_pending &&
> +	if ((check_resume || dev->power.force_check_resume) &&
> +	    dev->power.request_pending &&
> 	    dev->power.request == RPM_REQ_RESUME) {

There are only two cases in which false is passed to
__pm_runtime_disable(): one is in device_suspend_late(), and I don't
think that's relevant here, and the other is in pm_runtime_remove(),
which gets called when the device is going away.

So apparently, blk_pm_resume_queue() races with the device going away.
Is this even expected to happen?

If so, wouldn't it be better to modify pm_runtime_remove() to pass
true to __pm_runtime_disable() instead of making these ad hoc changes?
* Re: [PATCH 1/2] PM: runtime: Fix I/O hang due to race between resume and runtime disable
  2025-11-26 11:30   ` Rafael J. Wysocki
@ 2025-11-26 11:59     ` YangYang
  2025-11-26 12:36       ` Rafael J. Wysocki
  1 sibling, 1 reply; 44+ messages in thread
From: YangYang @ 2025-11-26 11:59 UTC (permalink / raw)
To: Rafael J. Wysocki
Cc: Jens Axboe, Pavel Machek, Len Brown, Greg Kroah-Hartman,
	Danilo Krummrich, linux-block, linux-kernel, linux-pm

On 2025/11/26 19:30, Rafael J. Wysocki wrote:
> On Wed, Nov 26, 2025 at 11:17 AM Yang Yang <yang.yang@vivo.com> wrote:
>>
>> We observed the following hung task during our test:
>>
>> [...]
>>
>> T1:                                            T2:
>> blk_queue_enter
>>   blk_pm_resume_queue
>>     pm_request_resume
>
> Shouldn't this be pm_runtime_resume() rather?

I'm not sure about that, I'll check if pm_runtime_resume() should be
used here instead.

>>       __pm_runtime_resume(dev, RPM_ASYNC)
>>         rpm_resume                             __pm_runtime_disable
>>           dev->power.request_pending = true      dev->power.disable_depth++
>>           queue_work(pm_wq, &dev->power.work)    __pm_runtime_barrier
>>   wait_event                                       cancel_work_sync(&dev->power.work)
>>
>> T1 queues the work item, which is then cancelled by T2 before it starts
>> execution. As a result, q->dev cannot be resumed, and T1 waits here for
>> a long time.
>>
>> [...]
>> -	if (check_resume && dev->power.request_pending &&
>> +	if ((check_resume || dev->power.force_check_resume) &&
>> +	    dev->power.request_pending &&
>> 	    dev->power.request == RPM_REQ_RESUME) {
>
> There are only two cases in which false is passed to
> __pm_runtime_disable(): one is in device_suspend_late(), and I don't
> think that's relevant here, and the other is in pm_runtime_remove(),
> which gets called when the device is going away.
>
> So apparently, blk_pm_resume_queue() races with the device going away.
> Is this even expected to happen?
>
> If so, wouldn't it be better to modify pm_runtime_remove() to pass
> true to __pm_runtime_disable() instead of making these ad hoc changes?

Sorry, I didn't make it clear in my previous message.
I can confirm that __pm_runtime_disable() is called from
device_suspend_late(), and this issue occurs during system suspend.
* Re: [PATCH 1/2] PM: runtime: Fix I/O hang due to race between resume and runtime disable
  2025-11-26 11:59     ` YangYang
@ 2025-11-26 12:36       ` Rafael J. Wysocki
  2025-11-26 15:33         ` Bart Van Assche
  0 siblings, 1 reply; 44+ messages in thread
From: Rafael J. Wysocki @ 2025-11-26 12:36 UTC (permalink / raw)
To: YangYang
Cc: Rafael J. Wysocki, Jens Axboe, Pavel Machek, Len Brown,
	Greg Kroah-Hartman, Danilo Krummrich, linux-block, linux-kernel,
	linux-pm

On Wed, Nov 26, 2025 at 12:59 PM YangYang <yang.yang@vivo.com> wrote:
>
> On 2025/11/26 19:30, Rafael J. Wysocki wrote:
> > On Wed, Nov 26, 2025 at 11:17 AM Yang Yang <yang.yang@vivo.com> wrote:
> >>
> >> We observed the following hung task during our test:
> >>
> >> [...]
> >>
> >> T1:                                            T2:
> >> blk_queue_enter
> >>   blk_pm_resume_queue
> >>     pm_request_resume
> >
> > Shouldn't this be pm_runtime_resume() rather?
>
> I'm not sure about that, I'll check if pm_runtime_resume() should be
> used here instead.

Well, the code as is now schedules an async resume of the device and
then waits for it to complete. It would be more straightforward to
resume the device synchronously IMV.

> >>       __pm_runtime_resume(dev, RPM_ASYNC)
> >>         rpm_resume                             __pm_runtime_disable
> >>           dev->power.request_pending = true      dev->power.disable_depth++
> >>           queue_work(pm_wq, &dev->power.work)    __pm_runtime_barrier
> >>   wait_event                                       cancel_work_sync(&dev->power.work)
> >>
> >> T1 queues the work item, which is then cancelled by T2 before it starts
> >> execution. As a result, q->dev cannot be resumed, and T1 waits here for
> >> a long time.
> >>
> >> [...]
> >> -	if (check_resume && dev->power.request_pending &&
> >> +	if ((check_resume || dev->power.force_check_resume) &&
> >> +	    dev->power.request_pending &&
> >> 	    dev->power.request == RPM_REQ_RESUME) {
> >
> > There are only two cases in which false is passed to
> > __pm_runtime_disable(): one is in device_suspend_late(), and I don't
> > think that's relevant here, and the other is in pm_runtime_remove(),
> > which gets called when the device is going away.
> >
> > So apparently, blk_pm_resume_queue() races with the device going away.
> > Is this even expected to happen?
> >
> > If so, wouldn't it be better to modify pm_runtime_remove() to pass
> > true to __pm_runtime_disable() instead of making these ad hoc changes?
>
> Sorry, I didn't make it clear in my previous message.
> I can confirm that __pm_runtime_disable() is called from
> device_suspend_late(), and this issue occurs during system suspend.

Interesting, because the runtime PM workqueue is frozen at this point,
so waiting for a work item in it to complete is pointless.

What the patch does is to declare that the device can be
runtime-resumed in device_suspend_late(), but this is kind of a hack
IMV as it potentially affects the device's parent etc.

If the device cannot stay in runtime suspend across the entire system
suspend transition, it should be resumed (synchronously) earlier, in
device_suspend() or in device_prepare() even.
* Re: [PATCH 1/2] PM: runtime: Fix I/O hang due to race between resume and runtime disable
  2025-11-26 12:36       ` Rafael J. Wysocki
@ 2025-11-26 15:33         ` Bart Van Assche
  2025-11-26 15:41           ` Rafael J. Wysocki
  0 siblings, 1 reply; 44+ messages in thread
From: Bart Van Assche @ 2025-11-26 15:33 UTC (permalink / raw)
To: Rafael J. Wysocki, YangYang
Cc: Jens Axboe, Pavel Machek, Len Brown, Greg Kroah-Hartman,
	Danilo Krummrich, linux-block, linux-kernel, linux-pm

On 11/26/25 4:36 AM, Rafael J. Wysocki wrote:
> Well, the code as is now schedules an async resume of the device and
> then waits for it to complete. It would be more straightforward to
> resume the device synchronously IMV.

That would increase the depth of the call stack significantly. I'm not
sure that's safe in this context.

Thanks,

Bart.
* Re: [PATCH 1/2] PM: runtime: Fix I/O hang due to race between resume and runtime disable
  2025-11-26 15:33         ` Bart Van Assche
@ 2025-11-26 15:41           ` Rafael J. Wysocki
  2025-11-26 18:40             ` Bart Van Assche
  0 siblings, 1 reply; 44+ messages in thread
From: Rafael J. Wysocki @ 2025-11-26 15:41 UTC (permalink / raw)
To: Bart Van Assche
Cc: Rafael J. Wysocki, YangYang, Jens Axboe, Pavel Machek, Len Brown,
	Greg Kroah-Hartman, Danilo Krummrich, linux-block, linux-kernel,
	linux-pm

On Wed, Nov 26, 2025 at 4:34 PM Bart Van Assche <bvanassche@acm.org> wrote:
>
> On 11/26/25 4:36 AM, Rafael J. Wysocki wrote:
> > Well, the code as is now schedules an async resume of the device and
> > then waits for it to complete. It would be more straightforward to
> > resume the device synchronously IMV.
>
> That would increase the depth of the call stack significantly. I'm not
> sure that's safe in this context.

As it stands, you have a basic problem with respect to system
suspend/hibernation. As I said before, the PM workqueue is frozen
during system suspend/hibernation transitions, so waiting for an async
resume request to complete then is pointless.
* Re: [PATCH 1/2] PM: runtime: Fix I/O hang due to race between resume and runtime disable
  2025-11-26 15:41           ` Rafael J. Wysocki
@ 2025-11-26 18:40             ` Bart Van Assche
  2025-11-27 11:29               ` YangYang
  0 siblings, 1 reply; 44+ messages in thread
From: Bart Van Assche @ 2025-11-26 18:40 UTC (permalink / raw)
To: Rafael J. Wysocki
Cc: YangYang, Jens Axboe, Pavel Machek, Len Brown, Greg Kroah-Hartman,
	Danilo Krummrich, linux-block, linux-kernel, linux-pm

On 11/26/25 7:41 AM, Rafael J. Wysocki wrote:
> As it stands, you have a basic problem with respect to system
> suspend/hibernation. As I said before, the PM workqueue is frozen
> during system suspend/hibernation transitions, so waiting for an async
> resume request to complete then is pointless.

Agreed. I noticed that any attempt to call request_firmware() from
driver system resume callback functions causes a deadlock if these
calls happen before the block device has been resumed.

Thanks,

Bart.
* Re: [PATCH 1/2] PM: runtime: Fix I/O hang due to race between resume and runtime disable
  2025-11-26 18:40             ` Bart Van Assche
@ 2025-11-27 11:29               ` YangYang
  2025-11-27 12:44                 ` Rafael J. Wysocki
  2025-12-01 16:40                 ` Bart Van Assche
  1 sibling, 2 replies; 44+ messages in thread
From: YangYang @ 2025-11-27 11:29 UTC (permalink / raw)
To: Bart Van Assche, Rafael J. Wysocki
Cc: Jens Axboe, Pavel Machek, Len Brown, Greg Kroah-Hartman,
	Danilo Krummrich, linux-block, linux-kernel, linux-pm

On 2025/11/27 2:40, Bart Van Assche wrote:
> On 11/26/25 7:41 AM, Rafael J. Wysocki wrote:
>> As it stands, you have a basic problem with respect to system
>> suspend/hibernation. As I said before, the PM workqueue is frozen
>> during system suspend/hibernation transitions, so waiting for an async
>> resume request to complete then is pointless.
>
> Agreed. I noticed that any attempt to call request_firmware() from
> driver system resume callback functions causes a deadlock if these
> calls happen before the block device has been resumed.
>
> Thanks,
>
> Bart.

Does this patch look reasonable to you? It hasn't been fully tested
yet, but the resume is now performed synchronously.

diff --git a/block/blk-core.c b/block/blk-core.c
index 66fb2071d..041d29ba4 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -323,12 +323,15 @@ int blk_queue_enter(struct request_queue *q, blk_mq_req_flags_t flags)
 	 * reordered.
 	 */
 	smp_rmb();
-	wait_event(q->mq_freeze_wq,
-		   (!q->mq_freeze_depth &&
-		    blk_pm_resume_queue(pm, q)) ||
-		   blk_queue_dying(q));
+check:
+	wait_event(q->mq_freeze_wq, !q->mq_freeze_depth);
+
 	if (blk_queue_dying(q))
 		return -ENODEV;
+	if (!blk_pm_resume_queue(pm, q)) {
+		pm_runtime_resume(q->dev);
+		goto check;
+	}
 }

 rwsem_acquire_read(&q->q_lockdep_map, 0, 0, _RET_IP_);
@@ -356,12 +359,15 @@ int __bio_queue_enter(struct request_queue *q, struct bio *bio)
 	 * reordered.
 	 */
 	smp_rmb();
-	wait_event(q->mq_freeze_wq,
-		   (!q->mq_freeze_depth &&
-		    blk_pm_resume_queue(false, q)) ||
-		   test_bit(GD_DEAD, &disk->state));
+check:
+	wait_event(q->mq_freeze_wq, !q->mq_freeze_depth);
+
 	if (test_bit(GD_DEAD, &disk->state))
 		goto dead;
+	if (!blk_pm_resume_queue(false, q)) {
+		pm_runtime_resume(q->dev);
+		goto check;
+	}
 }

 rwsem_acquire_read(&q->io_lockdep_map, 0, 0, _RET_IP_);
diff --git a/block/blk-pm.h b/block/blk-pm.h
index 8a5a0d4b3..c28fad105 100644
--- a/block/blk-pm.h
+++ b/block/blk-pm.h
@@ -12,7 +12,6 @@ static inline int blk_pm_resume_queue(const bool pm, struct request_queue *q)
 		return 1;	/* Nothing to do */
 	if (pm && q->rpm_status != RPM_SUSPENDED)
 		return 1;	/* Request allowed */
-	pm_request_resume(q->dev);
 	return 0;
 }
* Re: [PATCH 1/2] PM: runtime: Fix I/O hang due to race between resume and runtime disable
  2025-11-27 11:29               ` YangYang
@ 2025-11-27 12:44                 ` Rafael J. Wysocki
  2025-11-28  7:20                   ` YangYang
  0 siblings, 1 reply; 44+ messages in thread
From: Rafael J. Wysocki @ 2025-11-27 12:44 UTC (permalink / raw)
To: YangYang
Cc: Bart Van Assche, Rafael J. Wysocki, Jens Axboe, Pavel Machek,
	Len Brown, Greg Kroah-Hartman, Danilo Krummrich, linux-block,
	linux-kernel, linux-pm

On Thu, Nov 27, 2025 at 12:29 PM YangYang <yang.yang@vivo.com> wrote:
>
> On 2025/11/27 2:40, Bart Van Assche wrote:
> > On 11/26/25 7:41 AM, Rafael J. Wysocki wrote:
> >> As it stands, you have a basic problem with respect to system
> >> suspend/hibernation. As I said before, the PM workqueue is frozen
> >> during system suspend/hibernation transitions, so waiting for an async
> >> resume request to complete then is pointless.
> >
> > Agreed. I noticed that any attempt to call request_firmware() from
> > driver system resume callback functions causes a deadlock if these
> > calls happen before the block device has been resumed.
>
> Does this patch look reasonable to you? It hasn't been fully tested
> yet, but the resume is now performed synchronously.
>
> diff --git a/block/blk-core.c b/block/blk-core.c
> index 66fb2071d..041d29ba4 100644
> --- a/block/blk-core.c
> +++ b/block/blk-core.c
> @@ -323,12 +323,15 @@ int blk_queue_enter(struct request_queue *q, blk_mq_req_flags_t flags)
>  	 * reordered.
>  	 */
>  	smp_rmb();
> -	wait_event(q->mq_freeze_wq,
> -		   (!q->mq_freeze_depth &&
> -		    blk_pm_resume_queue(pm, q)) ||
> -		   blk_queue_dying(q));
> +check:
> +	wait_event(q->mq_freeze_wq, !q->mq_freeze_depth);

I think that you still need to check blk_queue_dying(q) under
wait_event() or you may not stop waiting when this happens.

> +
>  	if (blk_queue_dying(q))
>  		return -ENODEV;
> +	if (!blk_pm_resume_queue(pm, q)) {
> +		pm_runtime_resume(q->dev);
> +		goto check;
> +	}
>  }
>
>  rwsem_acquire_read(&q->q_lockdep_map, 0, 0, _RET_IP_);
> @@ -356,12 +359,15 @@ int __bio_queue_enter(struct request_queue *q, struct bio *bio)
>  	 * reordered.
>  	 */
>  	smp_rmb();
> -	wait_event(q->mq_freeze_wq,
> -		   (!q->mq_freeze_depth &&
> -		    blk_pm_resume_queue(false, q)) ||
> -		   test_bit(GD_DEAD, &disk->state));
> +check:
> +	wait_event(q->mq_freeze_wq, !q->mq_freeze_depth);

Analogously here, you may not stop waiting when test_bit(GD_DEAD,
&disk->state) is true.

> +
>  	if (test_bit(GD_DEAD, &disk->state))
>  		goto dead;
> +	if (!blk_pm_resume_queue(false, q)) {
> +		pm_runtime_resume(q->dev);
> +		goto check;
> +	}
>  }
>
>  rwsem_acquire_read(&q->io_lockdep_map, 0, 0, _RET_IP_);
> diff --git a/block/blk-pm.h b/block/blk-pm.h
> index 8a5a0d4b3..c28fad105 100644
> --- a/block/blk-pm.h
> +++ b/block/blk-pm.h
> @@ -12,7 +12,6 @@ static inline int blk_pm_resume_queue(const bool pm, struct request_queue *q)
>  		return 1;	/* Nothing to do */
>  	if (pm && q->rpm_status != RPM_SUSPENDED)
>  		return 1;	/* Request allowed */
> -	pm_request_resume(q->dev);
>  	return 0;
>  }

And I would rename blk_pm_resume_queue() to something like
blk_pm_queue_active() because it is a bit confusing as it stands.

Apart from the above remarks this makes sense to me FWIW.
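[Editor's note: the loop being discussed — dying check kept inside the wait condition, synchronous resume, then recheck from the top — can be sketched as a small userspace model. All names here are placeholders mirroring the block-layer ones; this is an illustrative sketch of the control flow, not the actual blk-core code.]

```c
#include <assert.h>
#include <stdbool.h>

/* Minimal stand-in for the request queue state consulted by the loop. */
struct fake_queue {
	int freeze_depth;	/* q->mq_freeze_depth */
	bool dying;		/* blk_queue_dying(q) */
	bool rpm_suspended;	/* q->rpm_status == RPM_SUSPENDED */
};

/* Models a synchronous pm_runtime_resume(q->dev). */
static void fake_pm_runtime_resume(struct fake_queue *q)
{
	q->rpm_suspended = false;
}

/* Returns 0 on success, -1 (standing in for -ENODEV) if the queue dies. */
int fake_queue_enter(struct fake_queue *q)
{
	for (;;) {
		/*
		 * Models wait_event(q->mq_freeze_wq,
		 *                   !q->mq_freeze_depth || blk_queue_dying(q)):
		 * the dying check must stay in the wait condition, or the
		 * waiter never wakes if the queue dies while frozen.
		 */
		if (q->freeze_depth && !q->dying)
			continue;	/* would sleep on mq_freeze_wq */
		if (q->dying)
			return -1;	/* -ENODEV */
		if (q->rpm_suspended) {
			fake_pm_runtime_resume(q);  /* synchronous resume */
			continue;	/* recheck freeze/dying from the top */
		}
		return 0;
	}
}
```

The recheck after the resume matters: the queue may have been re-frozen or marked dying while the synchronous resume ran, so returning success without looping back would reopen the original window.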
* Re: [PATCH 1/2] PM: runtime: Fix I/O hang due to race between resume and runtime disable
  2025-11-27 12:44                 ` Rafael J. Wysocki
@ 2025-11-28  7:20                   ` YangYang
  0 siblings, 0 replies; 44+ messages in thread
From: YangYang @ 2025-11-28 7:20 UTC (permalink / raw)
To: Rafael J. Wysocki
Cc: Bart Van Assche, Jens Axboe, Pavel Machek, Len Brown,
	Greg Kroah-Hartman, Danilo Krummrich, linux-block, linux-kernel,
	linux-pm

On 2025/11/27 20:44, Rafael J. Wysocki wrote:
> On Thu, Nov 27, 2025 at 12:29 PM YangYang <yang.yang@vivo.com> wrote:
>>
>> On 2025/11/27 2:40, Bart Van Assche wrote:
>>> On 11/26/25 7:41 AM, Rafael J. Wysocki wrote:
>>>> As it stands, you have a basic problem with respect to system
>>>> suspend/hibernation. As I said before, the PM workqueue is frozen
>>>> during system suspend/hibernation transitions, so waiting for an async
>>>> resume request to complete then is pointless.
>>>
>>> Agreed. I noticed that any attempt to call request_firmware() from
>>> driver system resume callback functions causes a deadlock if these
>>> calls happen before the block device has been resumed.
>>
>> Does this patch look reasonable to you? It hasn't been fully tested
>> yet, but the resume is now performed synchronously.
>>
>> [...]
>> +check:
>> +	wait_event(q->mq_freeze_wq, !q->mq_freeze_depth);
>
> I think that you still need to check blk_queue_dying(q) under
> wait_event() or you may not stop waiting when this happens.

Got it.

>> [...]
>> +check:
>> +	wait_event(q->mq_freeze_wq, !q->mq_freeze_depth);
>
> Analogously here, you may not stop waiting when test_bit(GD_DEAD,
> &disk->state) is true.

Got it.

>> [...]
>> -	pm_request_resume(q->dev);
>> 	return 0;
>> }
>
> And I would rename blk_pm_resume_queue() to something like
> blk_pm_queue_active() because it is a bit confusing as it stands.
>
> Apart from the above remarks this makes sense to me FWIW.

Got it. I'll fix these in the next version and run some tests before
sending it out. Thanks for the review.
* Re: [PATCH 1/2] PM: runtime: Fix I/O hang due to race between resume and runtime disable
  2025-11-27 11:29               ` YangYang
  2025-11-27 12:44                 ` Rafael J. Wysocki
@ 2025-12-01 16:40                 ` Bart Van Assche
  1 sibling, 0 replies; 44+ messages in thread
From: Bart Van Assche @ 2025-12-01 16:40 UTC (permalink / raw)
To: YangYang, Rafael J. Wysocki
Cc: Jens Axboe, Pavel Machek, Len Brown, Greg Kroah-Hartman,
	Danilo Krummrich, linux-block, linux-kernel, linux-pm

On 11/27/25 3:29 AM, YangYang wrote:
> diff --git a/block/blk-core.c b/block/blk-core.c
> index 66fb2071d..041d29ba4 100644
> --- a/block/blk-core.c
> +++ b/block/blk-core.c
> @@ -323,12 +323,15 @@ int blk_queue_enter(struct request_queue *q, blk_mq_req_flags_t flags)
>  	 * reordered.
>  	 */
>  	smp_rmb();
> -	wait_event(q->mq_freeze_wq,
> -		   (!q->mq_freeze_depth &&
> -		    blk_pm_resume_queue(pm, q)) ||
> -		   blk_queue_dying(q));
> +check:
> +	wait_event(q->mq_freeze_wq, !q->mq_freeze_depth);
> +
>  	if (blk_queue_dying(q))
>  		return -ENODEV;

This can't work. blk_mq_destroy_queue() freezes a request queue without
unfreezing it, so the above code will introduce a deadlock and/or a
use-after-free if it executes concurrently with blk_mq_destroy_queue().

Bart.
* Re: [PATCH 1/2] PM: runtime: Fix I/O hang due to race between resume and runtime disable 2025-11-26 11:30 ` Rafael J. Wysocki 2025-11-26 11:59 ` YangYang @ 2025-11-26 18:06 ` Bart Van Assche 2025-11-26 19:16 ` Rafael J. Wysocki 1 sibling, 1 reply; 44+ messages in thread From: Bart Van Assche @ 2025-11-26 18:06 UTC (permalink / raw) To: Rafael J. Wysocki, Yang Yang Cc: Jens Axboe, Pavel Machek, Len Brown, Greg Kroah-Hartman, Danilo Krummrich, linux-block, linux-kernel, linux-pm On 11/26/25 3:30 AM, Rafael J. Wysocki wrote: > On Wed, Nov 26, 2025 at 11:17 AM Yang Yang <yang.yang@vivo.com> wrote: >> T1: T2: >> blk_queue_enter >> blk_pm_resume_queue >> pm_request_resume > > Shouldn't this be pm_runtime_resume() rather? I tried to make that change on an Android device. As a result, the kernel complaint shown below appeared. My understanding is that sleeping in atomic context can trigger a deadlock and hence is not allowed. [ 13.728890][ T1] WARNING: CPU: 6 PID: 1 at kernel/sched/core.c:9714 __might_sleep+0x78/0x84 [ 13.758800][ T1] Call trace: [ 13.759027][ T1] __might_sleep+0x78/0x84 [ 13.759340][ T1] __pm_runtime_resume+0x40/0xb8 [ 13.759781][ T1] __bio_queue_enter+0xc0/0x1cc [ 13.760153][ T1] blk_mq_submit_bio+0x884/0xadc [ 13.760548][ T1] __submit_bio+0x2c8/0x49c [ 13.760879][ T1] __submit_bio_noacct_mq+0x38/0x88 [ 13.761242][ T1] submit_bio_noacct_nocheck+0x4fc/0x7b8 [ 13.761631][ T1] submit_bio+0x214/0x4c0 [ 13.761941][ T1] mpage_readahead+0x1b8/0x1fc [ 13.762284][ T1] blkdev_readahead+0x18/0x28 [ 13.762660][ T1] page_cache_ra_unbounded+0x310/0x4d8 [ 13.763072][ T1] page_cache_ra_order+0xc0/0x5b0 [ 13.763434][ T1] page_cache_sync_ra+0x17c/0x268 [ 13.763782][ T1] filemap_read+0x4c4/0x12f4 [ 13.764125][ T1] blkdev_read_iter+0x100/0x164 [ 13.764475][ T1] vfs_read+0x188/0x348 [ 13.764789][ T1] __se_sys_pread64+0x84/0xc8 [ 13.765180][ T1] __arm64_sys_pread64+0x1c/0x2c [ 13.765556][ T1] invoke_syscall+0x58/0xf0 [ 13.765876][ T1] do_el0_svc+0x8c/0xe0 [ 13.766172][ T1] 
el0_svc+0x50/0xd4 [ 13.766583][ T1] el0t_64_sync_handler+0x20/0xf4 [ 13.766932][ T1] el0t_64_sync+0x1bc/0x1c0 [ 13.767294][ T1] irq event stamp: 2589614 [ 13.767592][ T1] hardirqs last enabled at (2589613): [<ffffffc0800eaf24>] finish_lock_switch+0x70/0x108 [ 13.768283][ T1] hardirqs last disabled at (2589614): [<ffffffc0814b66f4>] el1_dbg+0x24/0x80 [ 13.768875][ T1] softirqs last enabled at (2589370): [<ffffffc080082a7c>] ____do_softirq+0x10/0x20 [ 13.769529][ T1] softirqs last disabled at (2589349): [<ffffffc080082a7c>] ____do_softirq+0x10/0x20 I think that the filemap_invalidate_lock_shared() call in page_cache_ra_unbounded() forbids sleeping in submit_bio(). Thanks, Bart. ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [PATCH 1/2] PM: runtime: Fix I/O hang due to race between resume and runtime disable 2025-11-26 18:06 ` Bart Van Assche @ 2025-11-26 19:16 ` Rafael J. Wysocki 2025-11-26 19:34 ` Rafael J. Wysocki 0 siblings, 1 reply; 44+ messages in thread From: Rafael J. Wysocki @ 2025-11-26 19:16 UTC (permalink / raw) To: Bart Van Assche Cc: Rafael J. Wysocki, Yang Yang, Jens Axboe, Pavel Machek, Len Brown, Greg Kroah-Hartman, Danilo Krummrich, linux-block, linux-kernel, linux-pm On Wed, Nov 26, 2025 at 7:06 PM Bart Van Assche <bvanassche@acm.org> wrote: > > On 11/26/25 3:30 AM, Rafael J. Wysocki wrote: > > On Wed, Nov 26, 2025 at 11:17 AM Yang Yang <yang.yang@vivo.com> wrote: > >> T1: T2: > >> blk_queue_enter > >> blk_pm_resume_queue > >> pm_request_resume > > > > Shouldn't this be pm_runtime_resume() rather? > > I tried to make that change on an Android device. As a result, the > kernel complaint shown below appeared. My understanding is that sleeping > in atomic context can trigger a deadlock and hence is not allowed. 
> > [ 13.728890][ T1] WARNING: CPU: 6 PID: 1 at > kernel/sched/core.c:9714 __might_sleep+0x78/0x84 > [ 13.758800][ T1] Call trace: > [ 13.759027][ T1] __might_sleep+0x78/0x84 > [ 13.759340][ T1] __pm_runtime_resume+0x40/0xb8 > [ 13.759781][ T1] __bio_queue_enter+0xc0/0x1cc > [ 13.760153][ T1] blk_mq_submit_bio+0x884/0xadc > [ 13.760548][ T1] __submit_bio+0x2c8/0x49c > [ 13.760879][ T1] __submit_bio_noacct_mq+0x38/0x88 > [ 13.761242][ T1] submit_bio_noacct_nocheck+0x4fc/0x7b8 > [ 13.761631][ T1] submit_bio+0x214/0x4c0 > [ 13.761941][ T1] mpage_readahead+0x1b8/0x1fc > [ 13.762284][ T1] blkdev_readahead+0x18/0x28 > [ 13.762660][ T1] page_cache_ra_unbounded+0x310/0x4d8 > [ 13.763072][ T1] page_cache_ra_order+0xc0/0x5b0 > [ 13.763434][ T1] page_cache_sync_ra+0x17c/0x268 > [ 13.763782][ T1] filemap_read+0x4c4/0x12f4 > [ 13.764125][ T1] blkdev_read_iter+0x100/0x164 > [ 13.764475][ T1] vfs_read+0x188/0x348 > [ 13.764789][ T1] __se_sys_pread64+0x84/0xc8 > [ 13.765180][ T1] __arm64_sys_pread64+0x1c/0x2c > [ 13.765556][ T1] invoke_syscall+0x58/0xf0 > [ 13.765876][ T1] do_el0_svc+0x8c/0xe0 > [ 13.766172][ T1] el0_svc+0x50/0xd4 > [ 13.766583][ T1] el0t_64_sync_handler+0x20/0xf4 > [ 13.766932][ T1] el0t_64_sync+0x1bc/0x1c0 > [ 13.767294][ T1] irq event stamp: 2589614 > [ 13.767592][ T1] hardirqs last enabled at (2589613): > [<ffffffc0800eaf24>] finish_lock_switch+0x70/0x108 > [ 13.768283][ T1] hardirqs last disabled at (2589614): > [<ffffffc0814b66f4>] el1_dbg+0x24/0x80 > [ 13.768875][ T1] softirqs last enabled at (2589370): > [<ffffffc080082a7c>] ____do_softirq+0x10/0x20 > [ 13.769529][ T1] softirqs last disabled at (2589349): > [<ffffffc080082a7c>] ____do_softirq+0x10/0x20 > > I think that the filemap_invalidate_lock_shared() call in > page_cache_ra_unbounded() forbids sleeping in submit_bio(). The wait_event() macro in __bio_queue_enter() calls might_sleep() at the very beginning, so why would it not complain? 
IIUC, this is the WARN_ONCE() in __might_sleep() about the task state being different from TASK_RUNNING, which triggers because prepare_to_wait_event() changes the task state to TASK_UNINTERRUPTIBLE. This means that calling pm_runtime_resume() cannot be part of the wait_event() condition, so blk_pm_resume_queue() and the wait_event() macros involving it would need some rewriting. ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [PATCH 1/2] PM: runtime: Fix I/O hang due to race between resume and runtime disable 2025-11-26 19:16 ` Rafael J. Wysocki @ 2025-11-26 19:34 ` Rafael J. Wysocki 2025-11-26 20:17 ` Rafael J. Wysocki 0 siblings, 1 reply; 44+ messages in thread From: Rafael J. Wysocki @ 2025-11-26 19:34 UTC (permalink / raw) To: Bart Van Assche Cc: Yang Yang, Jens Axboe, Pavel Machek, Len Brown, Greg Kroah-Hartman, Danilo Krummrich, linux-block, linux-kernel, linux-pm On Wed, Nov 26, 2025 at 8:16 PM Rafael J. Wysocki <rafael@kernel.org> wrote: > > On Wed, Nov 26, 2025 at 7:06 PM Bart Van Assche <bvanassche@acm.org> wrote: > > > > On 11/26/25 3:30 AM, Rafael J. Wysocki wrote: > > > On Wed, Nov 26, 2025 at 11:17 AM Yang Yang <yang.yang@vivo.com> wrote: > > >> T1: T2: > > >> blk_queue_enter > > >> blk_pm_resume_queue > > >> pm_request_resume > > > > > > Shouldn't this be pm_runtime_resume() rather? > > > > I tried to make that change on an Android device. As a result, the > > kernel complaint shown below appeared. My understanding is that sleeping > > in atomic context can trigger a deadlock and hence is not allowed. 
> > > > [ 13.728890][ T1] WARNING: CPU: 6 PID: 1 at > > kernel/sched/core.c:9714 __might_sleep+0x78/0x84 > > [ 13.758800][ T1] Call trace: > > [ 13.759027][ T1] __might_sleep+0x78/0x84 > > [ 13.759340][ T1] __pm_runtime_resume+0x40/0xb8 > > [ 13.759781][ T1] __bio_queue_enter+0xc0/0x1cc > > [ 13.760153][ T1] blk_mq_submit_bio+0x884/0xadc > > [ 13.760548][ T1] __submit_bio+0x2c8/0x49c > > [ 13.760879][ T1] __submit_bio_noacct_mq+0x38/0x88 > > [ 13.761242][ T1] submit_bio_noacct_nocheck+0x4fc/0x7b8 > > [ 13.761631][ T1] submit_bio+0x214/0x4c0 > > [ 13.761941][ T1] mpage_readahead+0x1b8/0x1fc > > [ 13.762284][ T1] blkdev_readahead+0x18/0x28 > > [ 13.762660][ T1] page_cache_ra_unbounded+0x310/0x4d8 > > [ 13.763072][ T1] page_cache_ra_order+0xc0/0x5b0 > > [ 13.763434][ T1] page_cache_sync_ra+0x17c/0x268 > > [ 13.763782][ T1] filemap_read+0x4c4/0x12f4 > > [ 13.764125][ T1] blkdev_read_iter+0x100/0x164 > > [ 13.764475][ T1] vfs_read+0x188/0x348 > > [ 13.764789][ T1] __se_sys_pread64+0x84/0xc8 > > [ 13.765180][ T1] __arm64_sys_pread64+0x1c/0x2c > > [ 13.765556][ T1] invoke_syscall+0x58/0xf0 > > [ 13.765876][ T1] do_el0_svc+0x8c/0xe0 > > [ 13.766172][ T1] el0_svc+0x50/0xd4 > > [ 13.766583][ T1] el0t_64_sync_handler+0x20/0xf4 > > [ 13.766932][ T1] el0t_64_sync+0x1bc/0x1c0 > > [ 13.767294][ T1] irq event stamp: 2589614 > > [ 13.767592][ T1] hardirqs last enabled at (2589613): > > [<ffffffc0800eaf24>] finish_lock_switch+0x70/0x108 > > [ 13.768283][ T1] hardirqs last disabled at (2589614): > > [<ffffffc0814b66f4>] el1_dbg+0x24/0x80 > > [ 13.768875][ T1] softirqs last enabled at (2589370): > > [<ffffffc080082a7c>] ____do_softirq+0x10/0x20 > > [ 13.769529][ T1] softirqs last disabled at (2589349): > > [<ffffffc080082a7c>] ____do_softirq+0x10/0x20 > > > > I think that the filemap_invalidate_lock_shared() call in > > page_cache_ra_unbounded() forbids sleeping in submit_bio(). 
> > The wait_event() macro in __bio_queue_enter() calls might_sleep() at > the very beginning, so why would it not complain? > > IIUC, this is the WARN_ONCE() in __might_sleep() about the task state > being different from TASK_RUNNING, which triggers because > prepare_to_wait_event() changes the task state to > TASK_UNINTERRUPTIBLE. > > This means that calling pm_runtime_resume() cannot be part of the > wait_event() condition, so blk_pm_resume_queue() and the wait_event() > macros involving it would need some rewriting. Interestingly enough, the pm_request_resume() call in blk_pm_resume_queue() is not even necessary in the __bio_queue_enter() case because pm is false there and it doesn't even check q->rpm_status. So in fact the resume is only necessary in blk_queue_enter() if pm is nonzero. ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [PATCH 1/2] PM: runtime: Fix I/O hang due to race between resume and runtime disable 2025-11-26 19:34 ` Rafael J. Wysocki @ 2025-11-26 20:17 ` Rafael J. Wysocki 2025-11-26 21:10 ` Bart Van Assche 0 siblings, 1 reply; 44+ messages in thread From: Rafael J. Wysocki @ 2025-11-26 20:17 UTC (permalink / raw) To: Bart Van Assche Cc: Yang Yang, Jens Axboe, Pavel Machek, Len Brown, Greg Kroah-Hartman, Danilo Krummrich, linux-block, linux-kernel, linux-pm On Wednesday, November 26, 2025 8:34:54 PM CET Rafael J. Wysocki wrote: > On Wed, Nov 26, 2025 at 8:16 PM Rafael J. Wysocki <rafael@kernel.org> wrote: > > > > On Wed, Nov 26, 2025 at 7:06 PM Bart Van Assche <bvanassche@acm.org> wrote: > > > > > > On 11/26/25 3:30 AM, Rafael J. Wysocki wrote: > > > > On Wed, Nov 26, 2025 at 11:17 AM Yang Yang <yang.yang@vivo.com> wrote: > > > >> T1: T2: > > > >> blk_queue_enter > > > >> blk_pm_resume_queue > > > >> pm_request_resume > > > > > > > > Shouldn't this be pm_runtime_resume() rather? > > > > > > I tried to make that change on an Android device. As a result, the > > > kernel complaint shown below appeared. My understanding is that sleeping > > > in atomic context can trigger a deadlock and hence is not allowed. 
> > > > > > [ 13.728890][ T1] WARNING: CPU: 6 PID: 1 at > > > kernel/sched/core.c:9714 __might_sleep+0x78/0x84 > > > [ 13.758800][ T1] Call trace: > > > [ 13.759027][ T1] __might_sleep+0x78/0x84 > > > [ 13.759340][ T1] __pm_runtime_resume+0x40/0xb8 > > > [ 13.759781][ T1] __bio_queue_enter+0xc0/0x1cc > > > [ 13.760153][ T1] blk_mq_submit_bio+0x884/0xadc > > > [ 13.760548][ T1] __submit_bio+0x2c8/0x49c > > > [ 13.760879][ T1] __submit_bio_noacct_mq+0x38/0x88 > > > [ 13.761242][ T1] submit_bio_noacct_nocheck+0x4fc/0x7b8 > > > [ 13.761631][ T1] submit_bio+0x214/0x4c0 > > > [ 13.761941][ T1] mpage_readahead+0x1b8/0x1fc > > > [ 13.762284][ T1] blkdev_readahead+0x18/0x28 > > > [ 13.762660][ T1] page_cache_ra_unbounded+0x310/0x4d8 > > > [ 13.763072][ T1] page_cache_ra_order+0xc0/0x5b0 > > > [ 13.763434][ T1] page_cache_sync_ra+0x17c/0x268 > > > [ 13.763782][ T1] filemap_read+0x4c4/0x12f4 > > > [ 13.764125][ T1] blkdev_read_iter+0x100/0x164 > > > [ 13.764475][ T1] vfs_read+0x188/0x348 > > > [ 13.764789][ T1] __se_sys_pread64+0x84/0xc8 > > > [ 13.765180][ T1] __arm64_sys_pread64+0x1c/0x2c > > > [ 13.765556][ T1] invoke_syscall+0x58/0xf0 > > > [ 13.765876][ T1] do_el0_svc+0x8c/0xe0 > > > [ 13.766172][ T1] el0_svc+0x50/0xd4 > > > [ 13.766583][ T1] el0t_64_sync_handler+0x20/0xf4 > > > [ 13.766932][ T1] el0t_64_sync+0x1bc/0x1c0 > > > [ 13.767294][ T1] irq event stamp: 2589614 > > > [ 13.767592][ T1] hardirqs last enabled at (2589613): > > > [<ffffffc0800eaf24>] finish_lock_switch+0x70/0x108 > > > [ 13.768283][ T1] hardirqs last disabled at (2589614): > > > [<ffffffc0814b66f4>] el1_dbg+0x24/0x80 > > > [ 13.768875][ T1] softirqs last enabled at (2589370): > > > [<ffffffc080082a7c>] ____do_softirq+0x10/0x20 > > > [ 13.769529][ T1] softirqs last disabled at (2589349): > > > [<ffffffc080082a7c>] ____do_softirq+0x10/0x20 > > > > > > I think that the filemap_invalidate_lock_shared() call in > > > page_cache_ra_unbounded() forbids sleeping in submit_bio(). 
> > > > The wait_event() macro in __bio_queue_enter() calls might_sleep() at > > the very beginning, so why would it not complain? > > > > IIUC, this is the WARN_ONCE() in __might_sleep() about the task state > > being different from TASK_RUNNING, which triggers because > > prepare_to_wait_event() changes the task state to > > TASK_UNINTERRUPTIBLE. > > > > This means that calling pm_runtime_resume() cannot be part of the > > wait_event() condition, so blk_pm_resume_queue() and the wait_event() > > macros involving it would need some rewriting. > > Interestingly enough, the pm_request_resume() call in > blk_pm_resume_queue() is not even necessary in the __bio_queue_enter() > case because pm is false there and it doesn't even check > q->rpm_status. > > So in fact the resume is only necessary in blk_queue_enter() if pm is nonzero. If I'm not completely in the weeds, something like the patch below should be doable. Also, I'd consider using pm_runtime_get_noresume() and pm_runtime_put_noidle() in blk_queue_enter() and blk_queue_exit(), respectively, in the "pm != 0" case to prevent the device from suspending while the .q_usage_counter ref is held. --- block/blk-core.c | 6 +++--- block/blk-pm.h | 7 ++++--- 2 files changed, 7 insertions(+), 6 deletions(-) --- a/block/blk-core.c +++ b/block/blk-core.c @@ -309,6 +309,8 @@ int blk_queue_enter(struct request_queue if (flags & BLK_MQ_REQ_NOWAIT) return -EAGAIN; + /* if necessary, resume .dev (assume success). 
*/ + blk_pm_resume_queue(pm, q); /* * read pair of barrier in blk_freeze_queue_start(), we need to * order reading __PERCPU_REF_DEAD flag of .q_usage_counter and @@ -318,9 +320,7 @@ int blk_queue_enter(struct request_queue */ smp_rmb(); wait_event(q->mq_freeze_wq, - (!q->mq_freeze_depth && - blk_pm_resume_queue(pm, q)) || - blk_queue_dying(q)); + !q->mq_freeze_depth || blk_queue_dying(q)); if (blk_queue_dying(q)) return -ENODEV; } --- a/block/blk-pm.h +++ b/block/blk-pm.h @@ -10,9 +10,10 @@ static inline int blk_pm_resume_queue(co { if (!q->dev || !blk_queue_pm_only(q)) return 1; /* Nothing to do */ - if (pm && q->rpm_status != RPM_SUSPENDED) - return 1; /* Request allowed */ - pm_request_resume(q->dev); + + if (pm) + pm_runtime_resume(q->dev); + return 0; } ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [PATCH 1/2] PM: runtime: Fix I/O hang due to race between resume and runtime disable 2025-11-26 20:17 ` Rafael J. Wysocki @ 2025-11-26 21:10 ` Bart Van Assche 2025-11-26 21:30 ` Rafael J. Wysocki 0 siblings, 1 reply; 44+ messages in thread From: Bart Van Assche @ 2025-11-26 21:10 UTC (permalink / raw) To: Rafael J. Wysocki Cc: Yang Yang, Jens Axboe, Pavel Machek, Len Brown, Greg Kroah-Hartman, Danilo Krummrich, linux-block, linux-kernel, linux-pm On 11/26/25 12:17 PM, Rafael J. Wysocki wrote: > --- a/block/blk-core.c > +++ b/block/blk-core.c > @@ -309,6 +309,8 @@ int blk_queue_enter(struct request_queue > if (flags & BLK_MQ_REQ_NOWAIT) > return -EAGAIN; > > + /* if necessary, resume .dev (assume success). */ > + blk_pm_resume_queue(pm, q); > /* > * read pair of barrier in blk_freeze_queue_start(), we need to > * order reading __PERCPU_REF_DEAD flag of .q_usage_counter and blk_queue_enter() may be called from the suspend path so I don't think that the above change will work. As an example, the UFS driver submits a SCSI START STOP UNIT command from its runtime suspend callback. The call chain is as follows: ufshcd_wl_runtime_suspend() __ufshcd_wl_suspend() ufshcd_set_dev_pwr_mode() ufshcd_execute_start_stop() scsi_execute_cmd() scsi_alloc_request() blk_queue_enter() blk_execute_rq() blk_mq_free_request() blk_queue_exit() Thanks, Bart. ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [PATCH 1/2] PM: runtime: Fix I/O hang due to race between resume and runtime disable 2025-11-26 21:10 ` Bart Van Assche @ 2025-11-26 21:30 ` Rafael J. Wysocki 2025-11-26 22:47 ` Bart Van Assche 0 siblings, 1 reply; 44+ messages in thread From: Rafael J. Wysocki @ 2025-11-26 21:30 UTC (permalink / raw) To: Bart Van Assche Cc: Rafael J. Wysocki, Yang Yang, Jens Axboe, Pavel Machek, Len Brown, Greg Kroah-Hartman, Danilo Krummrich, linux-block, linux-kernel, linux-pm On Wed, Nov 26, 2025 at 10:11 PM Bart Van Assche <bvanassche@acm.org> wrote: > > On 11/26/25 12:17 PM, Rafael J. Wysocki wrote: > > --- a/block/blk-core.c > > +++ b/block/blk-core.c > > @@ -309,6 +309,8 @@ int blk_queue_enter(struct request_queue > > if (flags & BLK_MQ_REQ_NOWAIT) > > return -EAGAIN; > > > > + /* if necessary, resume .dev (assume success). */ > > + blk_pm_resume_queue(pm, q); > > /* > > * read pair of barrier in blk_freeze_queue_start(), we need to > > * order reading __PERCPU_REF_DEAD flag of .q_usage_counter and > > blk_queue_enter() may be called from the suspend path so I don't think > that the above change will work. Why would the existing code work then? Are you suggesting that q->rpm_status should still be checked before calling pm_runtime_resume() or do you mean something else? > As an example, the UFS driver submits a > SCSI START STOP UNIT command from its runtime suspend callback. The call > chain is as follows: > > ufshcd_wl_runtime_suspend() > __ufshcd_wl_suspend() > ufshcd_set_dev_pwr_mode() > ufshcd_execute_start_stop() > scsi_execute_cmd() > scsi_alloc_request() > blk_queue_enter() > blk_execute_rq() > blk_mq_free_request() > blk_queue_exit() In any case, calling pm_request_resume() from blk_pm_resume_queue() in the !pm case is a mistake. ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [PATCH 1/2] PM: runtime: Fix I/O hang due to race between resume and runtime disable 2025-11-26 21:30 ` Rafael J. Wysocki @ 2025-11-26 22:47 ` Bart Van Assche 2025-11-27 12:34 ` Rafael J. Wysocki 0 siblings, 1 reply; 44+ messages in thread From: Bart Van Assche @ 2025-11-26 22:47 UTC (permalink / raw) To: Rafael J. Wysocki Cc: Yang Yang, Jens Axboe, Pavel Machek, Len Brown, Greg Kroah-Hartman, Danilo Krummrich, linux-block, linux-kernel, linux-pm On 11/26/25 1:30 PM, Rafael J. Wysocki wrote: > On Wed, Nov 26, 2025 at 10:11 PM Bart Van Assche <bvanassche@acm.org> wrote: >> >> On 11/26/25 12:17 PM, Rafael J. Wysocki wrote: >>> --- a/block/blk-core.c >>> +++ b/block/blk-core.c >>> @@ -309,6 +309,8 @@ int blk_queue_enter(struct request_queue >>> if (flags & BLK_MQ_REQ_NOWAIT) >>> return -EAGAIN; >>> >>> + /* if necessary, resume .dev (assume success). */ >>> + blk_pm_resume_queue(pm, q); >>> /* >>> * read pair of barrier in blk_freeze_queue_start(), we need to >>> * order reading __PERCPU_REF_DEAD flag of .q_usage_counter and >> >> blk_queue_enter() may be called from the suspend path so I don't think >> that the above change will work. > > Why would the existing code work then? The existing code works reliably on a very large number of devices. Maybe there is a misunderstanding? RQF_PM / BLK_MQ_REQ_PM are set for requests that should be processed even if the power status is changing (RPM_SUSPENDING or RPM_RESUMING). The meaning of the 'pm' variable is as follows: process this request even if a power state change is ongoing. > Are you suggesting that q->rpm_status should still be checked before > calling pm_runtime_resume() or do you mean something else? The purpose of the code changes from a previous email is not entirely clear to me so I'm not sure what the code should look like. But to answer your question, calling blk_pm_resume_queue() if the runtime status is RPM_SUSPENDED should be safe. 
>> As an example, the UFS driver submits a >> SCSI START STOP UNIT command from its runtime suspend callback. The call >> chain is as follows: >> >> ufshcd_wl_runtime_suspend() >> __ufshcd_wl_suspend() >> ufshcd_set_dev_pwr_mode() >> ufshcd_execute_start_stop() >> scsi_execute_cmd() >> scsi_alloc_request() >> blk_queue_enter() >> blk_execute_rq() >> blk_mq_free_request() >> blk_queue_exit() > > In any case, calling pm_request_resume() from blk_pm_resume_queue() in > the !pm case is a mistake. Hmm ... we may disagree about this. Does what I wrote above make clear why blk_pm_resume_queue() is called if pm == false? Thanks, Bart. ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [PATCH 1/2] PM: runtime: Fix I/O hang due to race between resume and runtime disable 2025-11-26 22:47 ` Bart Van Assche @ 2025-11-27 12:34 ` Rafael J. Wysocki 2025-12-01 9:46 ` YangYang 0 siblings, 1 reply; 44+ messages in thread From: Rafael J. Wysocki @ 2025-11-27 12:34 UTC (permalink / raw) To: Bart Van Assche Cc: Rafael J. Wysocki, Yang Yang, Jens Axboe, Pavel Machek, Len Brown, Greg Kroah-Hartman, Danilo Krummrich, linux-block, linux-kernel, linux-pm On Wed, Nov 26, 2025 at 11:47 PM Bart Van Assche <bvanassche@acm.org> wrote: > > On 11/26/25 1:30 PM, Rafael J. Wysocki wrote: > > On Wed, Nov 26, 2025 at 10:11 PM Bart Van Assche <bvanassche@acm.org> wrote: > >> > >> On 11/26/25 12:17 PM, Rafael J. Wysocki wrote: > >>> --- a/block/blk-core.c > >>> +++ b/block/blk-core.c > >>> @@ -309,6 +309,8 @@ int blk_queue_enter(struct request_queue > >>> if (flags & BLK_MQ_REQ_NOWAIT) > >>> return -EAGAIN; > >>> > >>> + /* if necessary, resume .dev (assume success). */ > >>> + blk_pm_resume_queue(pm, q); > >>> /* > >>> * read pair of barrier in blk_freeze_queue_start(), we need to > >>> * order reading __PERCPU_REF_DEAD flag of .q_usage_counter and > >> > >> blk_queue_enter() may be called from the suspend path so I don't think > >> that the above change will work. > > > > Why would the existing code work then? > > The existing code works reliably on a very large number of devices. Well, except that it doesn't work during system suspend and hibernation when the PM workqueue is frozen. I think that we agree here. This needs to be addressed because it may very well cause system suspend to deadlock. There are two possible ways to address it I can think of: 1. Changing blk_pm_resume_queue() and its users to carry out a synchronous resume of q->dev instead of calling pm_request_resume() and (effectively) waiting for the queued-up runtime resume of q->dev to take effect. This would be my preferred option, but at this point I'm not sure if it's viable. 2. 
Stop freezing the PM workqueue before system suspend/hibernation and adapt device_suspend_late() to that. This should be doable, even though it is a bit risky because it may uncover some latent bugs (the freezing of the PM workqueue has been there forever), but it wouldn't address the problem entirely because device_suspend_late() would still need to disable runtime PM for the device (and for some devices it is disabled earlier), so pm_request_resume() would just start to fail at that point and if blk_queue_enter() were called after that point for a device supporting runtime PM, it might deadlock. > Maybe there is a misunderstanding? RQF_PM / BLK_MQ_REQ_PM are set for > requests that should be processed even if the power status is changing > (RPM_SUSPENDING or RPM_RESUMING). The meaning of the 'pm' variable is > as follows: process this request even if a power state change is > ongoing. I see. The behavior depends on whether or not q->pm_only is set. If it is not set, both blk_queue_enter() and __bio_queue_enter() will allow the request to be processed. If q->pm_only is set, __bio_queue_enter() will wait until it gets cleared and in that case pm_request_resume(q->dev) is called to make that happen (did I get it right?). This is a bit fragile because what if the async resume of q->dev fails for some reason? You deadlock instead of failing the request. Unlike __bio_queue_enter(), blk_queue_enter() additionally checks the runtime PM status of the queue if q->pm_only is set and it will allow the request to be processed in that case so long as q->rpm_status is not RPM_SUSPENDED. However, if the queue status is RPM_SUSPENDED, pm_request_resume(q->dev) will be called like in the __bio_queue_enter() case. I'm not sure why pm_request_resume(q->dev) needs to be called from within blk_pm_resume_queue(). Arguably, it should be sufficient to call it once before using the wait_event() macro, if the conditions checked by blk_pm_resume_queue() are not met. 
> > Are you suggesting that q->rpm_status should still be checked before > > calling pm_runtime_resume() or do you mean something else? > The purpose of the code changes from a previous email is not entirely > clear to me so I'm not sure what the code should look like. But to > answer your question, calling blk_pm_resume_queue() if the runtime > status is RPM_SUSPENDED should be safe. > >> As an example, the UFS driver submits a > >> SCSI START STOP UNIT command from its runtime suspend callback. The call > >> chain is as follows: > >> > >> ufshcd_wl_runtime_suspend() > >> __ufshcd_wl_suspend() > >> ufshcd_set_dev_pwr_mode() > >> ufshcd_execute_start_stop() > >> scsi_execute_cmd() > >> scsi_alloc_request() > >> blk_queue_enter() > >> blk_execute_rq() > >> blk_mq_free_request() > >> blk_queue_exit() > > > > In any case, calling pm_request_resume() from blk_pm_resume_queue() in > > the !pm case is a mistake. > Hmm ... we may disagree about this. Does what I wrote above make clear > why blk_pm_resume_queue() is called if pm == false? Yes, it does, thanks! ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [PATCH 1/2] PM: runtime: Fix I/O hang due to race between resume and runtime disable 2025-11-27 12:34 ` Rafael J. Wysocki @ 2025-12-01 9:46 ` YangYang 2025-12-01 12:56 ` YangYang 2025-12-01 18:47 ` Rafael J. Wysocki 0 siblings, 2 replies; 44+ messages in thread From: YangYang @ 2025-12-01 9:46 UTC (permalink / raw) To: Rafael J. Wysocki, Bart Van Assche Cc: Jens Axboe, Pavel Machek, Len Brown, Greg Kroah-Hartman, Danilo Krummrich, linux-block, linux-kernel, linux-pm On 2025/11/27 20:34, Rafael J. Wysocki wrote: > On Wed, Nov 26, 2025 at 11:47 PM Bart Van Assche <bvanassche@acm.org> wrote: >> >> On 11/26/25 1:30 PM, Rafael J. Wysocki wrote: >>> On Wed, Nov 26, 2025 at 10:11 PM Bart Van Assche <bvanassche@acm.org> wrote: >>>> >>>> On 11/26/25 12:17 PM, Rafael J. Wysocki wrote: >>>>> --- a/block/blk-core.c >>>>> +++ b/block/blk-core.c >>>>> @@ -309,6 +309,8 @@ int blk_queue_enter(struct request_queue >>>>> if (flags & BLK_MQ_REQ_NOWAIT) >>>>> return -EAGAIN; >>>>> >>>>> + /* if necessary, resume .dev (assume success). */ >>>>> + blk_pm_resume_queue(pm, q); >>>>> /* >>>>> * read pair of barrier in blk_freeze_queue_start(), we need to >>>>> * order reading __PERCPU_REF_DEAD flag of .q_usage_counter and >>>> >>>> blk_queue_enter() may be called from the suspend path so I don't think >>>> that the above change will work. >>> >>> Why would the existing code work then? >> >> The existing code works reliably on a very large number of devices. > > Well, except that it doesn't work during system suspend and > hibernation when the PM workqueue is frozen. I think that we agree > here. > > This needs to be addressed because it may very well cause system > suspend to deadlock. > > There are two possible ways to address it I can think of: > > 1. Changing blk_pm_resume_queue() and its users to carry out a > synchronous resume of q->dev instead of calling pm_request_resume() > and (effectively) waiting for the queued-up runtime resume of q->dev > to take effect. 
>
> This would be my preferred option, but at this point I'm not sure if
> it's viable.
>

After __pm_runtime_disable() is called from device_suspend_late(), dev->power.disable_depth is set, preventing
rpm_resume() from making progress until the system resume completes, regardless of whether rpm_resume() is invoked
synchronously or asynchronously.
Performing a synchronous resume of q->dev seems to have a similar effect to removing the following code block from
__pm_runtime_barrier(), which is invoked by __pm_runtime_disable():

1428         if (dev->power.request_pending) {
1429                 dev->power.request = RPM_REQ_NONE;
1430                 spin_unlock_irq(&dev->power.lock);
1431
1432                 cancel_work_sync(&dev->power.work);
1433
1434                 spin_lock_irq(&dev->power.lock);
1435                 dev->power.request_pending = false;
1436         }

> 2. Stop freezing the PM workqueue before system suspend/hibernation
> and adapt device_suspend_late() to that.
>
> This should be doable, even though it is a bit risky because it may
> uncover some latent bugs (the freezing of the PM workqueue has been
> there forever), but it wouldn't address the problem entirely because
> device_suspend_late() would still need to disable runtime PM for the
> device (and for some devices it is disabled earlier), so
> pm_request_resume() would just start to fail at that point and if
> blk_queue_enter() were called after that point for a device supporting
> runtime PM, it might deadlock.
>
>> Maybe there is a misunderstanding? RQF_PM / BLK_MQ_REQ_PM are set for
>> requests that should be processed even if the power status is changing
>> (RPM_SUSPENDING or RPM_RESUMING). The meaning of the 'pm' variable is
>> as follows: process this request even if a power state change is
>> ongoing.
>
> I see.
>
> The behavior depends on whether or not q->pm_only is set. If it is
> not set, both blk_queue_enter() and __bio_queue_enter() will allow the
> request to be processed.
>
> If q->pm_only is set, __bio_queue_enter() will wait until it gets
> cleared and in that case pm_request_resume(q->dev) is called to make
> that happen (did I get it right?). This is a bit fragile because what
> if the async resume of q->dev fails for some reason? You deadlock
> instead of failing the request.
>
> Unlike __bio_queue_enter(), blk_queue_enter() additionally checks the
> runtime PM status of the queue if q->pm_only is set and it will allow
> the request to be processed in that case so long as q->rpm_status is
> not RPM_SUSPENDED. However, if the queue status is RPM_SUSPENDED,
> pm_request_resume(q->dev) will be called like in the
> __bio_queue_enter() case.
>
> I'm not sure why pm_request_resume(q->dev) needs to be called from
> within blk_pm_resume_queue(). Arguably, it should be sufficient to
> call it once before using the wait_event() macro, if the conditions
> checked by blk_pm_resume_queue() are not met.
>
>>> Are you suggesting that q->rpm_status should still be checked before
>>> calling pm_runtime_resume() or do you mean something else?
>> The purpose of the code changes from a previous email is not entirely
>> clear to me so I'm not sure what the code should look like. But to
>> answer your question, calling blk_pm_resume_queue() if the runtime
>> status is RPM_SUSPENDED should be safe.
>>>> As an example, the UFS driver submits a
>>>> SCSI START STOP UNIT command from its runtime suspend callback. The call
>>>> chain is as follows:
>>>>
>>>> ufshcd_wl_runtime_suspend()
>>>>   __ufshcd_wl_suspend()
>>>>     ufshcd_set_dev_pwr_mode()
>>>>       ufshcd_execute_start_stop()
>>>>         scsi_execute_cmd()
>>>>           scsi_alloc_request()
>>>>             blk_queue_enter()
>>>>           blk_execute_rq()
>>>>           blk_mq_free_request()
>>>>             blk_queue_exit()
>>>
>>> In any case, calling pm_request_resume() from blk_pm_resume_queue() in
>>> the !pm case is a mistake.
>> Hmm ... we may disagree about this. Does what I wrote above make clear
>> why blk_pm_resume_queue() is called if pm == false?
>
> Yes, it does, thanks!

^ permalink raw reply	[flat|nested] 44+ messages in thread
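The entry rules discussed above (pm_only, rpm_status, and the 'pm' flag) can be sketched as a small user-space model. The struct, enum values, and the may_enter_queue() helper below are illustrative stand-ins, not the actual kernel definitions:

```c
/* Simplified model of the queue-entry logic described above: a request
 * marked as a PM request may proceed while a power transition is in
 * flight, but every caller waits (after kicking off an async resume)
 * once the device is fully suspended. Hypothetical names throughout. */
#include <assert.h>
#include <stdbool.h>

enum rpm_status { RPM_ACTIVE, RPM_RESUMING, RPM_SUSPENDING, RPM_SUSPENDED };

struct queue_model {
	bool pm_only;               /* models q->pm_only */
	enum rpm_status rpm_status; /* models q->rpm_status */
	int resume_requests;        /* counts modeled pm_request_resume() calls */
};

/* Returns true when the caller may enter the queue. */
static bool may_enter_queue(struct queue_model *q, bool pm)
{
	if (!q->pm_only)
		return true;
	if (pm && q->rpm_status != RPM_SUSPENDED)
		return true;
	q->resume_requests++;	/* models the async pm_request_resume(q->dev) */
	return false;		/* caller goes to sleep in wait_event() */
}
```

As the discussion notes, the wait in the last branch is only safe if the asynchronous resume actually runs to completion at some point.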
* Re: [PATCH 1/2] PM: runtime: Fix I/O hang due to race between resume and runtime disable
  2025-12-01  9:46         ` YangYang
@ 2025-12-01 12:56           ` YangYang
  2025-12-01 18:55             ` Rafael J. Wysocki
  2025-12-01 18:47           ` Rafael J. Wysocki
  1 sibling, 1 reply; 44+ messages in thread
From: YangYang @ 2025-12-01 12:56 UTC (permalink / raw)
To: Rafael J. Wysocki, Bart Van Assche
Cc: Jens Axboe, Pavel Machek, Len Brown, Greg Kroah-Hartman,
	Danilo Krummrich, linux-block, linux-kernel, linux-pm

On 2025/12/1 17:46, YangYang wrote:
> On 2025/11/27 20:34, Rafael J. Wysocki wrote:
>> On Wed, Nov 26, 2025 at 11:47 PM Bart Van Assche <bvanassche@acm.org> wrote:
>>>
>>> On 11/26/25 1:30 PM, Rafael J. Wysocki wrote:
>>>> On Wed, Nov 26, 2025 at 10:11 PM Bart Van Assche <bvanassche@acm.org> wrote:
>>>>>
>>>>> On 11/26/25 12:17 PM, Rafael J. Wysocki wrote:
>>>>>> --- a/block/blk-core.c
>>>>>> +++ b/block/blk-core.c
>>>>>> @@ -309,6 +309,8 @@ int blk_queue_enter(struct request_queue
>>>>>>           if (flags & BLK_MQ_REQ_NOWAIT)
>>>>>>                   return -EAGAIN;
>>>>>>
>>>>>> +         /* if necessary, resume .dev (assume success). */
>>>>>> +         blk_pm_resume_queue(pm, q);
>>>>>>           /*
>>>>>>            * read pair of barrier in blk_freeze_queue_start(), we need to
>>>>>>            * order reading __PERCPU_REF_DEAD flag of .q_usage_counter and
>>>>>
>>>>> blk_queue_enter() may be called from the suspend path so I don't think
>>>>> that the above change will work.
>>>>
>>>> Why would the existing code work then?
>>>
>>> The existing code works reliably on a very large number of devices.
>>
>> Well, except that it doesn't work during system suspend and
>> hibernation when the PM workqueue is frozen. I think that we agree
>> here.
>>
>> This needs to be addressed because it may very well cause system
>> suspend to deadlock.
>>
>> There are two possible ways to address it I can think of:
>>
>> 1. Changing blk_pm_resume_queue() and its users to carry out a
>> synchronous resume of q->dev instead of calling pm_request_resume()
>> and (effectively) waiting for the queued-up runtime resume of q->dev
>> to take effect.
>>
>> This would be my preferred option, but at this point I'm not sure if
>> it's viable.
>>
>
> After __pm_runtime_disable() is called from device_suspend_late(), dev->power.disable_depth is set, preventing
> rpm_resume() from making progress until the system resume completes, regardless of whether rpm_resume() is invoked
> synchronously or asynchronously.
> Performing a synchronous resume of q->dev seems to have a similar effect to removing the following code block from
> __pm_runtime_barrier(), which is invoked by __pm_runtime_disable():
>
> 1428         if (dev->power.request_pending) {
> 1429                 dev->power.request = RPM_REQ_NONE;
> 1430                 spin_unlock_irq(&dev->power.lock);
> 1431
> 1432                 cancel_work_sync(&dev->power.work);
> 1433
> 1434                 spin_lock_irq(&dev->power.lock);
> 1435                 dev->power.request_pending = false;
> 1436         }
>

Since both synchronous and asynchronous resumes face similar issues,
it may be sufficient to keep using the asynchronous resume path as long as
pending work items are not canceled while the PM workqueue is frozen.
This allows the pending work to proceed normally once the PM workqueue
is unfrozen.

---
 drivers/base/power/main.c    |  2 +-
 drivers/base/power/runtime.c | 17 +++++++++++------
 include/linux/pm_runtime.h   |  6 +++---
 3 files changed, 15 insertions(+), 10 deletions(-)

diff --git a/drivers/base/power/main.c b/drivers/base/power/main.c
index 1de1cd72b616..d5c3d7a6777e 100644
--- a/drivers/base/power/main.c
+++ b/drivers/base/power/main.c
@@ -1635,7 +1635,7 @@ static void device_suspend_late(struct device *dev, pm_message_t state, bool asy
 	 * Disable runtime PM for the device without checking if there is a
 	 * pending resume request for it.
 	 */
-	__pm_runtime_disable(dev, false);
+	__pm_runtime_disable(dev, false, true);
 
 	if (dev->power.syscore)
 		goto Skip;
diff --git a/drivers/base/power/runtime.c b/drivers/base/power/runtime.c
index 1b11a3cd4acc..ff3fdfba2dc8 100644
--- a/drivers/base/power/runtime.c
+++ b/drivers/base/power/runtime.c
@@ -1421,11 +1421,16 @@ EXPORT_SYMBOL_GPL(__pm_runtime_set_status);
  *
  * Should be called under dev->power.lock with interrupts disabled.
  */
-static void __pm_runtime_barrier(struct device *dev)
+static void __pm_runtime_barrier(struct device *dev, bool frozen)
 {
 	pm_runtime_deactivate_timer(dev);
 
-	if (dev->power.request_pending) {
+	/*
+	 * If the PM workqueue has already been frozen, the following
+	 * operations are unnecessary. This allows any pending work to
+	 * continue execution once the PM workqueue is unfrozen.
+	 */
+	if (!frozen && dev->power.request_pending) {
 		dev->power.request = RPM_REQ_NONE;
 		spin_unlock_irq(&dev->power.lock);
 
@@ -1485,7 +1490,7 @@ int pm_runtime_barrier(struct device *dev)
 		retval = 1;
 	}
 
-	__pm_runtime_barrier(dev);
+	__pm_runtime_barrier(dev, false);
 
 	spin_unlock_irq(&dev->power.lock);
 	pm_runtime_put_noidle(dev);
@@ -1519,7 +1524,7 @@ void pm_runtime_unblock(struct device *dev)
 	spin_unlock_irq(&dev->power.lock);
 }
 
-void __pm_runtime_disable(struct device *dev, bool check_resume)
+void __pm_runtime_disable(struct device *dev, bool check_resume, bool frozen)
 {
 	spin_lock_irq(&dev->power.lock);
 
@@ -1550,7 +1555,7 @@ void __pm_runtime_disable(struct device *dev, bool check_resume)
 	update_pm_runtime_accounting(dev);
 
 	if (!dev->power.disable_depth++) {
-		__pm_runtime_barrier(dev);
+		__pm_runtime_barrier(dev, frozen);
 		dev->power.last_status = dev->power.runtime_status;
 	}
 
@@ -1893,7 +1898,7 @@ void pm_runtime_reinit(struct device *dev)
  */
 void pm_runtime_remove(struct device *dev)
 {
-	__pm_runtime_disable(dev, false);
+	__pm_runtime_disable(dev, false, false);
 	pm_runtime_reinit(dev);
 }
 
diff --git a/include/linux/pm_runtime.h b/include/linux/pm_runtime.h
index 0b436e15f4cd..102060a9ebc7 100644
--- a/include/linux/pm_runtime.h
+++ b/include/linux/pm_runtime.h
@@ -80,7 +80,7 @@ extern int pm_runtime_barrier(struct device *dev);
 extern bool pm_runtime_block_if_disabled(struct device *dev);
 extern void pm_runtime_unblock(struct device *dev);
 extern void pm_runtime_enable(struct device *dev);
-extern void __pm_runtime_disable(struct device *dev, bool check_resume);
+extern void __pm_runtime_disable(struct device *dev, bool check_resume, bool frozen);
 extern void pm_runtime_allow(struct device *dev);
 extern void pm_runtime_forbid(struct device *dev);
 extern void pm_runtime_no_callbacks(struct device *dev);
@@ -288,7 +288,7 @@ static inline int pm_runtime_barrier(struct device *dev) { return 0; }
 static inline bool pm_runtime_block_if_disabled(struct device *dev) { return true; }
 static inline void pm_runtime_unblock(struct device *dev) {}
 static inline void pm_runtime_enable(struct device *dev) {}
-static inline void __pm_runtime_disable(struct device *dev, bool c) {}
+static inline void __pm_runtime_disable(struct device *dev, bool c, bool f) {}
 static inline bool pm_runtime_blocked(struct device *dev) { return true; }
 static inline void pm_runtime_allow(struct device *dev) {}
 static inline void pm_runtime_forbid(struct device *dev) {}
@@ -775,7 +775,7 @@ static inline int pm_runtime_set_suspended(struct device *dev)
  */
 static inline void pm_runtime_disable(struct device *dev)
 {
-	__pm_runtime_disable(dev, true);
+	__pm_runtime_disable(dev, true, false);
 }
 
 /**
-- 
* Re: [PATCH 1/2] PM: runtime: Fix I/O hang due to race between resume and runtime disable
  2025-12-01 12:56           ` YangYang
@ 2025-12-01 18:55             ` Rafael J. Wysocki
  2025-12-02 10:33               ` YangYang
  0 siblings, 1 reply; 44+ messages in thread
From: Rafael J. Wysocki @ 2025-12-01 18:55 UTC (permalink / raw)
To: YangYang
Cc: Rafael J. Wysocki, Bart Van Assche, Jens Axboe, Pavel Machek,
	Len Brown, Greg Kroah-Hartman, Danilo Krummrich, linux-block,
	linux-kernel, linux-pm

On Mon, Dec 1, 2025 at 1:56 PM YangYang <yang.yang@vivo.com> wrote:
>
> On 2025/12/1 17:46, YangYang wrote:
> > On 2025/11/27 20:34, Rafael J. Wysocki wrote:
> >> On Wed, Nov 26, 2025 at 11:47 PM Bart Van Assche <bvanassche@acm.org> wrote:
> >>>
> >>> On 11/26/25 1:30 PM, Rafael J. Wysocki wrote:
> >>>> On Wed, Nov 26, 2025 at 10:11 PM Bart Van Assche <bvanassche@acm.org> wrote:
> >>>>>
> >>>>> On 11/26/25 12:17 PM, Rafael J. Wysocki wrote:
> >>>>>> --- a/block/blk-core.c
> >>>>>> +++ b/block/blk-core.c
> >>>>>> @@ -309,6 +309,8 @@ int blk_queue_enter(struct request_queue
> >>>>>>           if (flags & BLK_MQ_REQ_NOWAIT)
> >>>>>>                   return -EAGAIN;
> >>>>>>
> >>>>>> +         /* if necessary, resume .dev (assume success). */
> >>>>>> +         blk_pm_resume_queue(pm, q);
> >>>>>>           /*
> >>>>>>            * read pair of barrier in blk_freeze_queue_start(), we need to
> >>>>>>            * order reading __PERCPU_REF_DEAD flag of .q_usage_counter and
> >>>>>
> >>>>> blk_queue_enter() may be called from the suspend path so I don't think
> >>>>> that the above change will work.
> >>>>
> >>>> Why would the existing code work then?
> >>>
> >>> The existing code works reliably on a very large number of devices.
> >>
> >> Well, except that it doesn't work during system suspend and
> >> hibernation when the PM workqueue is frozen. I think that we agree
> >> here.
> >>
> >> This needs to be addressed because it may very well cause system
> >> suspend to deadlock.
> >>
> >> There are two possible ways to address it I can think of:
> >>
> >> 1. Changing blk_pm_resume_queue() and its users to carry out a
> >> synchronous resume of q->dev instead of calling pm_request_resume()
> >> and (effectively) waiting for the queued-up runtime resume of q->dev
> >> to take effect.
> >>
> >> This would be my preferred option, but at this point I'm not sure if
> >> it's viable.
> >>
> >
> > After __pm_runtime_disable() is called from device_suspend_late(), dev->power.disable_depth is set, preventing
> > rpm_resume() from making progress until the system resume completes, regardless of whether rpm_resume() is invoked
> > synchronously or asynchronously.
> > Performing a synchronous resume of q->dev seems to have a similar effect to removing the following code block from
> > __pm_runtime_barrier(), which is invoked by __pm_runtime_disable():
> >
> > 1428         if (dev->power.request_pending) {
> > 1429                 dev->power.request = RPM_REQ_NONE;
> > 1430                 spin_unlock_irq(&dev->power.lock);
> > 1431
> > 1432                 cancel_work_sync(&dev->power.work);
> > 1433
> > 1434                 spin_lock_irq(&dev->power.lock);
> > 1435                 dev->power.request_pending = false;
> > 1436         }
> >
>
> Since both synchronous and asynchronous resumes face similar issues,

No, they don't.

> it may be sufficient to keep using the asynchronous resume path as long as
> pending work items are not canceled while the PM workqueue is frozen.

Except for two things:

1. If blk_queue_enter() or __bio_queue_enter() is allowed to race with
disabling runtime PM, queuing up the resume work item may fail in the
first place.

2. If a device runtime resume work item is queued up before the whole
system is suspended, it may not make sense to run that work item after
resuming the whole system because the state of the system as a whole
is generally different at that point.

> This allows the pending work to proceed normally once the PM workqueue
> is unfrozen.

Not really.
* Re: [PATCH 1/2] PM: runtime: Fix I/O hang due to race between resume and runtime disable
  2025-12-01 18:55             ` Rafael J. Wysocki
@ 2025-12-02 10:33               ` YangYang
  2025-12-02 12:18                 ` Rafael J. Wysocki
  0 siblings, 1 reply; 44+ messages in thread
From: YangYang @ 2025-12-02 10:33 UTC (permalink / raw)
To: Rafael J. Wysocki
Cc: Bart Van Assche, Jens Axboe, Pavel Machek, Len Brown,
	Greg Kroah-Hartman, Danilo Krummrich, linux-block, linux-kernel,
	linux-pm

On 2025/12/2 2:55, Rafael J. Wysocki wrote:
> On Mon, Dec 1, 2025 at 1:56 PM YangYang <yang.yang@vivo.com> wrote:
>>
>> On 2025/12/1 17:46, YangYang wrote:
>>> On 2025/11/27 20:34, Rafael J. Wysocki wrote:
>>>> On Wed, Nov 26, 2025 at 11:47 PM Bart Van Assche <bvanassche@acm.org> wrote:
>>>>>
>>>>> On 11/26/25 1:30 PM, Rafael J. Wysocki wrote:
>>>>>> On Wed, Nov 26, 2025 at 10:11 PM Bart Van Assche <bvanassche@acm.org> wrote:
>>>>>>>
>>>>>>> On 11/26/25 12:17 PM, Rafael J. Wysocki wrote:
>>>>>>>> --- a/block/blk-core.c
>>>>>>>> +++ b/block/blk-core.c
>>>>>>>> @@ -309,6 +309,8 @@ int blk_queue_enter(struct request_queue
>>>>>>>>           if (flags & BLK_MQ_REQ_NOWAIT)
>>>>>>>>                   return -EAGAIN;
>>>>>>>>
>>>>>>>> +         /* if necessary, resume .dev (assume success). */
>>>>>>>> +         blk_pm_resume_queue(pm, q);
>>>>>>>>           /*
>>>>>>>>            * read pair of barrier in blk_freeze_queue_start(), we need to
>>>>>>>>            * order reading __PERCPU_REF_DEAD flag of .q_usage_counter and
>>>>>>>
>>>>>>> blk_queue_enter() may be called from the suspend path so I don't think
>>>>>>> that the above change will work.
>>>>>>
>>>>>> Why would the existing code work then?
>>>>>
>>>>> The existing code works reliably on a very large number of devices.
>>>>
>>>> Well, except that it doesn't work during system suspend and
>>>> hibernation when the PM workqueue is frozen. I think that we agree
>>>> here.
>>>>
>>>> This needs to be addressed because it may very well cause system
>>>> suspend to deadlock.
>>>>
>>>> There are two possible ways to address it I can think of:
>>>>
>>>> 1. Changing blk_pm_resume_queue() and its users to carry out a
>>>> synchronous resume of q->dev instead of calling pm_request_resume()
>>>> and (effectively) waiting for the queued-up runtime resume of q->dev
>>>> to take effect.
>>>>
>>>> This would be my preferred option, but at this point I'm not sure if
>>>> it's viable.
>>>>
>>>
>>> After __pm_runtime_disable() is called from device_suspend_late(), dev->power.disable_depth is set, preventing
>>> rpm_resume() from making progress until the system resume completes, regardless of whether rpm_resume() is invoked
>>> synchronously or asynchronously.
>>> Performing a synchronous resume of q->dev seems to have a similar effect to removing the following code block from
>>> __pm_runtime_barrier(), which is invoked by __pm_runtime_disable():
>>>
>>> 1428         if (dev->power.request_pending) {
>>> 1429                 dev->power.request = RPM_REQ_NONE;
>>> 1430                 spin_unlock_irq(&dev->power.lock);
>>> 1431
>>> 1432                 cancel_work_sync(&dev->power.work);
>>> 1433
>>> 1434                 spin_lock_irq(&dev->power.lock);
>>> 1435                 dev->power.request_pending = false;
>>> 1436         }
>>>
>>
>> Since both synchronous and asynchronous resumes face similar issues,
>
> No, they don't.
>
>> it may be sufficient to keep using the asynchronous resume path as long as
>> pending work items are not canceled while the PM workqueue is frozen.
>
> Except for two things:
>
> 1. If blk_queue_enter() or __bio_queue_enter() is allowed to race with
> disabling runtime PM, queuing up the resume work item may fail in the
> first place.
>

Perhaps my understanding is incorrect, but during the execution of
device_suspend_late(), the PM workqueue should already be frozen.
In that case, queuing a resume work item would not fail; it would
simply not be executed until the workqueue is unfrozen, as long as
it is not canceled.

> 2. If a device runtime resume work item is queued up before the whole
> system is suspended, it may not make sense to run that work item after
> resuming the whole system because the state of the system as a whole
> is generally different at that point.
>
>> This allows the pending work to proceed normally once the PM workqueue
>> is unfrozen.
>
> Not really.
* Re: [PATCH 1/2] PM: runtime: Fix I/O hang due to race between resume and runtime disable
  2025-12-02 10:33               ` YangYang
@ 2025-12-02 12:18                 ` Rafael J. Wysocki
  0 siblings, 0 replies; 44+ messages in thread
From: Rafael J. Wysocki @ 2025-12-02 12:18 UTC (permalink / raw)
To: YangYang
Cc: Rafael J. Wysocki, Bart Van Assche, Jens Axboe, Pavel Machek,
	Len Brown, Greg Kroah-Hartman, Danilo Krummrich, linux-block,
	linux-kernel, linux-pm

On Tue, Dec 2, 2025 at 11:33 AM YangYang <yang.yang@vivo.com> wrote:
>
> On 2025/12/2 2:55, Rafael J. Wysocki wrote:
> > On Mon, Dec 1, 2025 at 1:56 PM YangYang <yang.yang@vivo.com> wrote:
> >>
> >> On 2025/12/1 17:46, YangYang wrote:
> >>> On 2025/11/27 20:34, Rafael J. Wysocki wrote:
> >>>> On Wed, Nov 26, 2025 at 11:47 PM Bart Van Assche <bvanassche@acm.org> wrote:
> >>>>>
> >>>>> On 11/26/25 1:30 PM, Rafael J. Wysocki wrote:
> >>>>>> On Wed, Nov 26, 2025 at 10:11 PM Bart Van Assche <bvanassche@acm.org> wrote:
> >>>>>>>
> >>>>>>> On 11/26/25 12:17 PM, Rafael J. Wysocki wrote:
> >>>>>>>> --- a/block/blk-core.c
> >>>>>>>> +++ b/block/blk-core.c
> >>>>>>>> @@ -309,6 +309,8 @@ int blk_queue_enter(struct request_queue
> >>>>>>>>           if (flags & BLK_MQ_REQ_NOWAIT)
> >>>>>>>>                   return -EAGAIN;
> >>>>>>>>
> >>>>>>>> +         /* if necessary, resume .dev (assume success). */
> >>>>>>>> +         blk_pm_resume_queue(pm, q);
> >>>>>>>>           /*
> >>>>>>>>            * read pair of barrier in blk_freeze_queue_start(), we need to
> >>>>>>>>            * order reading __PERCPU_REF_DEAD flag of .q_usage_counter and
> >>>>>>>
> >>>>>>> blk_queue_enter() may be called from the suspend path so I don't think
> >>>>>>> that the above change will work.
> >>>>>>
> >>>>>> Why would the existing code work then?
> >>>>>
> >>>>> The existing code works reliably on a very large number of devices.
> >>>>
> >>>> Well, except that it doesn't work during system suspend and
> >>>> hibernation when the PM workqueue is frozen. I think that we agree
> >>>> here.
> >>>>
> >>>> This needs to be addressed because it may very well cause system
> >>>> suspend to deadlock.
> >>>>
> >>>> There are two possible ways to address it I can think of:
> >>>>
> >>>> 1. Changing blk_pm_resume_queue() and its users to carry out a
> >>>> synchronous resume of q->dev instead of calling pm_request_resume()
> >>>> and (effectively) waiting for the queued-up runtime resume of q->dev
> >>>> to take effect.
> >>>>
> >>>> This would be my preferred option, but at this point I'm not sure if
> >>>> it's viable.
> >>>>
> >>>
> >>> After __pm_runtime_disable() is called from device_suspend_late(), dev->power.disable_depth is set, preventing
> >>> rpm_resume() from making progress until the system resume completes, regardless of whether rpm_resume() is invoked
> >>> synchronously or asynchronously.
> >>> Performing a synchronous resume of q->dev seems to have a similar effect to removing the following code block from
> >>> __pm_runtime_barrier(), which is invoked by __pm_runtime_disable():
> >>>
> >>> 1428         if (dev->power.request_pending) {
> >>> 1429                 dev->power.request = RPM_REQ_NONE;
> >>> 1430                 spin_unlock_irq(&dev->power.lock);
> >>> 1431
> >>> 1432                 cancel_work_sync(&dev->power.work);
> >>> 1433
> >>> 1434                 spin_lock_irq(&dev->power.lock);
> >>> 1435                 dev->power.request_pending = false;
> >>> 1436         }
> >>>
> >>
> >> Since both synchronous and asynchronous resumes face similar issues,
> >
> > No, they don't.
> >
> >> it may be sufficient to keep using the asynchronous resume path as long as
> >> pending work items are not canceled while the PM workqueue is frozen.
> >
> > Except for two things:
> >
> > 1. If blk_queue_enter() or __bio_queue_enter() is allowed to race with
> > disabling runtime PM, queuing up the resume work item may fail in the
> > first place.
> >
>
> Perhaps my understanding is incorrect, but during the execution of
> device_suspend_late(), the PM workqueue should already be frozen.
> In that case, queuing a resume work item would not fail; it would
> simply not be executed until the workqueue is unfrozen, as long as
> it is not canceled.

rpm_resume() returns an error if runtime PM is disabled for the given
device and the device status is RPM_SUSPENDED even if it is called
with RPM_ASYNC or RPM_NOWAIT in the flags.
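The bailout being described, where a disabled device never even gets a resume work item queued, can be modelled in a few lines of user-space C. The struct, helper name, and error value are illustrative stand-ins for the real kernel code:

```c
/* Minimal model of the behaviour described above: once disable_depth is
 * non-zero and the status is RPM_SUSPENDED, the call fails immediately,
 * even for RPM_ASYNC/RPM_NOWAIT callers, so no work item is queued at
 * all. Hypothetical names; -13 stands in for -EACCES. */
#include <assert.h>
#include <stdbool.h>

enum rpm_status { RPM_ACTIVE, RPM_SUSPENDED };

struct dev_power_model {
	int disable_depth;
	enum rpm_status runtime_status;
	bool request_pending;	/* models a queued-up resume work item */
};

static int rpm_resume_model(struct dev_power_model *p, bool async)
{
	if (p->disable_depth > 0)
		return p->runtime_status == RPM_ACTIVE ? 1 : -13;
	if (async) {
		p->request_pending = true;	/* queue the resume work item */
		return 0;
	}
	p->runtime_status = RPM_ACTIVE;		/* synchronous resume */
	return 1;
}
```

The point of the model is the first branch: whether the PM workqueue is frozen or not never comes into play for a device whose runtime PM is already disabled.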
* Re: [PATCH 1/2] PM: runtime: Fix I/O hang due to race between resume and runtime disable
  2025-12-01  9:46         ` YangYang
  2025-12-01 12:56           ` YangYang
@ 2025-12-01 18:47           ` Rafael J. Wysocki
  2025-12-01 19:58             ` [PATCH v1] PM: sleep: Do not flag runtime PM workqueue as freezable Rafael J. Wysocki
  ` (2 more replies)
  1 sibling, 3 replies; 44+ messages in thread
From: Rafael J. Wysocki @ 2025-12-01 18:47 UTC (permalink / raw)
To: YangYang
Cc: Rafael J. Wysocki, Bart Van Assche, Jens Axboe, Pavel Machek,
	Len Brown, Greg Kroah-Hartman, Danilo Krummrich, linux-block,
	linux-kernel, linux-pm

On Mon, Dec 1, 2025 at 10:46 AM YangYang <yang.yang@vivo.com> wrote:
>
> On 2025/11/27 20:34, Rafael J. Wysocki wrote:
> > On Wed, Nov 26, 2025 at 11:47 PM Bart Van Assche <bvanassche@acm.org> wrote:
> >>
> >> On 11/26/25 1:30 PM, Rafael J. Wysocki wrote:
> >>> On Wed, Nov 26, 2025 at 10:11 PM Bart Van Assche <bvanassche@acm.org> wrote:
> >>>>
> >>>> On 11/26/25 12:17 PM, Rafael J. Wysocki wrote:
> >>>>> --- a/block/blk-core.c
> >>>>> +++ b/block/blk-core.c
> >>>>> @@ -309,6 +309,8 @@ int blk_queue_enter(struct request_queue
> >>>>>           if (flags & BLK_MQ_REQ_NOWAIT)
> >>>>>                   return -EAGAIN;
> >>>>>
> >>>>> +         /* if necessary, resume .dev (assume success). */
> >>>>> +         blk_pm_resume_queue(pm, q);
> >>>>>           /*
> >>>>>            * read pair of barrier in blk_freeze_queue_start(), we need to
> >>>>>            * order reading __PERCPU_REF_DEAD flag of .q_usage_counter and
> >>>>
> >>>> blk_queue_enter() may be called from the suspend path so I don't think
> >>>> that the above change will work.
> >>>
> >>> Why would the existing code work then?
> >>
> >> The existing code works reliably on a very large number of devices.
> >
> > Well, except that it doesn't work during system suspend and
> > hibernation when the PM workqueue is frozen. I think that we agree
> > here.
> >
> > This needs to be addressed because it may very well cause system
> > suspend to deadlock.
> >
> > There are two possible ways to address it I can think of:
> >
> > 1. Changing blk_pm_resume_queue() and its users to carry out a
> > synchronous resume of q->dev instead of calling pm_request_resume()
> > and (effectively) waiting for the queued-up runtime resume of q->dev
> > to take effect.
> >
> > This would be my preferred option, but at this point I'm not sure if
> > it's viable.
> >
>
> After __pm_runtime_disable() is called from device_suspend_late(),
> dev->power.disable_depth is set, preventing rpm_resume() from making
> progress until the system resume completes, regardless of whether
> rpm_resume() is invoked synchronously or asynchronously.

This isn't factually correct.

rpm_resume() will make progress when runtime PM is disabled, but it
will not resume the target device. That's what disabling runtime PM
means.

Of course, when runtime PM is disabled for the given device,
rpm_resume() will return an error code that can be checked. However,
if pm_request_resume() is called before disabling runtime PM for the
device and runtime PM is disabled for it before the work item queued
by pm_request_resume() runs, the failure will be silent from the
caller's perspective.

> Performing a synchronous resume of q->dev seems to have a similar
> effect to removing the following code block from
> __pm_runtime_barrier(), which is invoked by __pm_runtime_disable():
>
> 1428         if (dev->power.request_pending) {
> 1429                 dev->power.request = RPM_REQ_NONE;
> 1430                 spin_unlock_irq(&dev->power.lock);
> 1431
> 1432                 cancel_work_sync(&dev->power.work);
> 1433
> 1434                 spin_lock_irq(&dev->power.lock);
> 1435                 dev->power.request_pending = false;
> 1436         }

It is different.

First of all, synchronous runtime resume is not affected by the
freezing of the runtime PM workqueue.

Next, see the remark above regarding returning an error code.

Finally, so long as __pm_runtime_resume() acquires power.lock before
__pm_runtime_disable(), the synchronous resume will be waited for by
the latter.

Generally speaking, if blk_queue_enter() or __bio_queue_enter() may
run in parallel with device_suspend_late() for q->dev, the driver of
that device is defective, because it is responsible for preventing
this situation from happening. The most straightforward way to
achieve that is to provide a .suspend() callback for q->dev that will
runtime-resume it (and, of course, q->dev will need to be prepared for
system suspend as appropriate after that).

If blk_queue_enter() or __bio_queue_enter() is allowed to race with
disabling runtime PM for q->dev, failure to resume q->dev is always
possible and there are no changes that can be made to
pm_runtime_disable() to prevent that from happening. If
__pm_runtime_disable() wins the race, it will increment
power.disable_depth and rpm_resume() will bail out when it sees that
no matter what.

You should not conflate "runtime PM doesn't work when it is disabled"
with "asynchronous runtime PM doesn't work after freezing the PM
workqueue". They are both true, but they are not the same.
* [PATCH v1] PM: sleep: Do not flag runtime PM workqueue as freezable
  2025-12-01 18:47           ` Rafael J. Wysocki
@ 2025-12-01 19:58             ` Rafael J. Wysocki
  2025-12-02  1:06               ` Bart Van Assche
  ` (2 more replies)
  2025-12-02  0:40             ` [PATCH 1/2] PM: runtime: Fix I/O hang due to race between resume and runtime disable Bart Van Assche
  2025-12-05 15:24             ` [PATCH v2] PM: sleep: Do not flag runtime PM workqueue as freezable Rafael J. Wysocki
  2 siblings, 3 replies; 44+ messages in thread
From: Rafael J. Wysocki @ 2025-12-01 19:58 UTC (permalink / raw)
To: YangYang
Cc: Bart Van Assche, Jens Axboe, Greg Kroah-Hartman, Danilo Krummrich,
	linux-block, linux-kernel, linux-pm, Ulf Hansson

On Monday, December 1, 2025 7:47:46 PM CET Rafael J. Wysocki wrote:
> On Mon, Dec 1, 2025 at 10:46 AM YangYang <yang.yang@vivo.com> wrote:

[cut]

> If blk_queue_enter() or __bio_queue_enter() is allowed to race with
> disabling runtime PM for q->dev, failure to resume q->dev is always
> possible and there are no changes that can be made to
> pm_runtime_disable() to prevent that from happening. If
> __pm_runtime_disable() wins the race, it will increment
> power.disable_depth and rpm_resume() will bail out when it sees that
> no matter what.
>
> You should not conflate "runtime PM doesn't work when it is disabled"
> with "asynchronous runtime PM doesn't work after freezing the PM
> workqueue". They are both true, but they are not the same.

So I've been testing the patch below for a few days and it will eliminate
the latter, but even after this patch runtime PM will be disabled in
device_suspend_late() and if the problem you are facing is still there
after this patch, it will need to be dealt with at the driver level.

Generally speaking, driver involvement is needed to make runtime PM and
system suspend/resume work together in the majority of cases.

---
From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Subject: PM: sleep: Do not flag runtime PM workqueue as freezable

Till now, the runtime PM workqueue has been flagged as freezable, so
it does not process work items during system-wide PM transitions like
system suspend and resume.

The original reason to do that was to reduce the likelihood of runtime
PM getting in the way of system-wide PM processing, but now it is mostly
an optimization because (1) runtime suspend of devices is prevented by
bumping up their runtime PM usage counters in device_prepare() and (2)
device drivers are expected to disable runtime PM for the devices
handled by them before they embark on system-wide PM activities that
may change the state of the hardware or otherwise interfere with
runtime PM.

However, it prevents asynchronous runtime resume of devices from working
during system-wide PM transitions, which is confusing because synchronous
runtime resume is not prevented at the same time, and it also sometimes
turns out to be problematic.

For example, it has been reported that blk_queue_enter() may deadlock
during a system suspend transition because of the pm_request_resume()
usage in it [1]. That happens because the asynchronous runtime resume
of the given device is not processed due to the freezing of the runtime
PM workqueue.

While it may be better to address this particular issue in the block
layer, the very presence of it means that similar problems may be
expected to occur elsewhere.

For this reason, remove the WQ_FREEZABLE flag from the runtime PM
workqueue and make device_suspend_late() use the generic variant of
pm_runtime_disable() that will carry out a runtime resume of the device
synchronously if there is a pending resume request for it.

Also update the comment before the pm_runtime_disable() call in
device_suspend_late() to document the fact that runtime PM should not
be expected to work for the device until the end of
device_resume_early().

This change may, even though it is not expected to, uncover some latent
issues related to queuing up asynchronous runtime resume work items
during system suspend or hibernation. However, they should be limited
to the interference between runtime resume and system-wide PM callbacks
in the cases when device drivers start to handle system-wide PM before
disabling runtime PM as described above.

Link: https://lore.kernel.org/linux-pm/20251126101636.205505-2-yang.yang@vivo.com/
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
 drivers/base/power/main.c | 7 ++++---
 kernel/power/main.c       | 2 +-
 2 files changed, 5 insertions(+), 4 deletions(-)

--- a/drivers/base/power/main.c
+++ b/drivers/base/power/main.c
@@ -1647,10 +1647,11 @@ static void device_suspend_late(struct d
 		goto Complete;
 
 	/*
-	 * Disable runtime PM for the device without checking if there is a
-	 * pending resume request for it.
+	 * After this point, any runtime PM operations targeting the device
+	 * will fail until the corresponding pm_runtime_enable() call in
+	 * device_resume_early().
 	 */
-	__pm_runtime_disable(dev, false);
+	pm_runtime_disable(dev);
 
 	if (dev->power.syscore)
 		goto Skip;
--- a/kernel/power/main.c
+++ b/kernel/power/main.c
@@ -1125,7 +1125,7 @@ EXPORT_SYMBOL_GPL(pm_wq);
 
 static int __init pm_start_workqueues(void)
 {
-	pm_wq = alloc_workqueue("pm", WQ_FREEZABLE | WQ_UNBOUND, 0);
+	pm_wq = alloc_workqueue("pm", WQ_UNBOUND, 0);
 
 	if (!pm_wq)
 		return -ENOMEM;
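The difference the patch leans on, between the check_resume and no-check variants of disabling runtime PM, can be sketched as a tiny user-space model. The struct and helper below are purely illustrative, not the real __pm_runtime_disable() implementation:

```c
/* Sketch of the check_resume distinction: with check_resume set (as in
 * pm_runtime_disable()), a pending resume request is carried out
 * synchronously before runtime PM is disabled, whereas the old
 * __pm_runtime_disable(dev, false) call simply cancelled it.
 * Hypothetical names throughout. */
#include <assert.h>
#include <stdbool.h>

struct dev_model {
	int disable_depth;
	bool resume_pending;	/* a queued-up asynchronous resume */
	bool resumed;		/* did the device actually resume? */
};

static void pm_runtime_disable_model(struct dev_model *d, bool check_resume)
{
	if (check_resume && d->resume_pending)
		d->resumed = true;	/* resume carried out synchronously */
	d->resume_pending = false;	/* any leftover request is cancelled */
	d->disable_depth++;		/* further resumes will now fail */
}
```

In both variants the request is gone afterwards; the difference is whether the device was actually brought back to full power first.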
* Re: [PATCH v1] PM: sleep: Do not flag runtime PM workqueue as freezable
  2025-12-01 19:58             ` [PATCH v1] PM: sleep: Do not flag runtime PM workqueue as freezable Rafael J. Wysocki
@ 2025-12-02  1:06               ` Bart Van Assche
  2025-12-02 11:53                 ` Rafael J. Wysocki
  2025-12-02 10:36               ` YangYang
  2025-12-02 14:58               ` Ulf Hansson
  2 siblings, 1 reply; 44+ messages in thread
From: Bart Van Assche @ 2025-12-02  1:06 UTC (permalink / raw)
To: Rafael J. Wysocki, YangYang
Cc: Jens Axboe, Greg Kroah-Hartman, Danilo Krummrich, linux-block,
	linux-kernel, linux-pm, Ulf Hansson

On 12/1/25 11:58 AM, Rafael J. Wysocki wrote:
> So I've been testing the patch below for a few days and it will eliminate
> the latter, but even after this patch runtime PM will be disabled in
> device_suspend_late() and if the problem you are facing is still there
> after this patch, it will need to be dealt with at the driver level.
>
> Generally speaking, driver involvement is needed to make runtime PM and
> system suspend/resume work together in the majority of cases.

Thank you for having developed and shared this patch. Is the following
quote from the Linux kernel documentation still correct with this patch
applied or should an update for Documentation/power/runtime_pm.rst
perhaps be included in this patch?

"The power management workqueue pm_wq in which bus types and device
drivers can put their PM-related work items. It is strongly recommended
that pm_wq be used for queuing all work items related to runtime PM,
because this allows them to be synchronized with system-wide power
transitions (suspend to RAM, hibernation and resume from system sleep
states). pm_wq is declared in include/linux/pm_runtime.h and defined in
kernel/power/main.c."

Bart.
* Re: [PATCH v1] PM: sleep: Do not flag runtime PM workqueue as freezable
  2025-12-02  1:06               ` Bart Van Assche
@ 2025-12-02 11:53                 ` Rafael J. Wysocki
  2025-12-02 13:29                   ` Rafael J. Wysocki
  0 siblings, 1 reply; 44+ messages in thread
From: Rafael J. Wysocki @ 2025-12-02 11:53 UTC (permalink / raw)
To: Bart Van Assche
Cc: Rafael J. Wysocki, YangYang, Jens Axboe, Greg Kroah-Hartman,
	Danilo Krummrich, linux-block, linux-kernel, linux-pm, Ulf Hansson

On Tue, Dec 2, 2025 at 2:06 AM Bart Van Assche <bvanassche@acm.org> wrote:
>
> On 12/1/25 11:58 AM, Rafael J. Wysocki wrote:
> > So I've been testing the patch below for a few days and it will eliminate
> > the latter, but even after this patch runtime PM will be disabled in
> > device_suspend_late() and if the problem you are facing is still there
> > after this patch, it will need to be dealt with at the driver level.
> >
> > Generally speaking, driver involvement is needed to make runtime PM and
> > system suspend/resume work together in the majority of cases.
>
> Thank you for having developed and shared this patch. Is the following
> quote from the Linux kernel documentation still correct with this patch
> applied or should an update for Documentation/power/runtime_pm.rst
> perhaps be included in this patch?
>
> "The power management workqueue pm_wq in which bus types and device
> drivers can put their PM-related work items. It is strongly recommended
> that pm_wq be used for queuing all work items related to runtime PM,
> because this allows them to be synchronized with system-wide power
> transitions (suspend to RAM, hibernation and resume from system sleep
> states). pm_wq is declared in include/linux/pm_runtime.h and defined in
> kernel/power/main.c."

It doesn't say what the synchronization mechanism is in particular and
some synchronization is still provided after this patch, via the
pm_runtime_barrier() in device_suspend(), for example.
* Re: [PATCH v1] PM: sleep: Do not flag runtime PM workqueue as freezable 2025-12-02 11:53 ` Rafael J. Wysocki @ 2025-12-02 13:29 ` Rafael J. Wysocki 0 siblings, 0 replies; 44+ messages in thread From: Rafael J. Wysocki @ 2025-12-02 13:29 UTC (permalink / raw) To: Bart Van Assche Cc: YangYang, Jens Axboe, Greg Kroah-Hartman, Danilo Krummrich, linux-block, linux-kernel, linux-pm, Ulf Hansson On Tue, Dec 2, 2025 at 12:53 PM Rafael J. Wysocki <rafael@kernel.org> wrote: > > On Tue, Dec 2, 2025 at 2:06 AM Bart Van Assche <bvanassche@acm.org> wrote: > > > > On 12/1/25 11:58 AM, Rafael J. Wysocki wrote: > > > So I've been testing the patch below for a few days and it will eliminate > > > the latter, but even after this patch runtime PM will be disabled in > > > device_suspend_late() and if the problem you are facing is still there > > > after this patch, it will need to dealt with at the driver level. > > > > > > Generally speaking, driver involvement is needed to make runtime PM and > > > system suspend/resume work together in the majority of cases. > > > > Thank you for having developed and shared this patch. Is the following > > quote from the Linux kernel documentation still correct with this patch > > applied or should an update for Documentation/power/runtime_pm.rst > > perhaps be included in this patch? > > > > "The power management workqueue pm_wq in which bus types and device > > drivers can > > put their PM-related work items. It is strongly recommended that > > pm_wq be > > used for queuing all work items related to runtime PM, because this > > allows > > them to be synchronized with system-wide power transitions (suspend > > to RAM, > > hibernation and resume from system sleep states). pm_wq is declared in > > include/linux/pm_runtime.h and defined in kernel/power/main.c." 
> > It doesn't say what the synchronization mechanism is in particular and > some synchronization is still provided after this patch, via the > pm_runtime_barrier() in device_suspend(), for example. Though there is another piece of documentation that needs updating to reflect the changes in this patch, so I'll send a v2 at one point. ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [PATCH v1] PM: sleep: Do not flag runtime PM workqueue as freezable 2025-12-01 19:58 ` [PATCH v1] PM: sleep: Do not flag runtime PM workqueue as freezable Rafael J. Wysocki 2025-12-02 1:06 ` Bart Van Assche @ 2025-12-02 10:36 ` YangYang 2025-12-02 14:58 ` Ulf Hansson 2 siblings, 0 replies; 44+ messages in thread From: YangYang @ 2025-12-02 10:36 UTC (permalink / raw) To: Rafael J. Wysocki Cc: Bart Van Assche, Jens Axboe, Greg Kroah-Hartman, Danilo Krummrich, linux-block, linux-kernel, linux-pm, Ulf Hansson On 2025/12/2 3:58, Rafael J. Wysocki wrote: > On Monday, December 1, 2025 7:47:46 PM CET Rafael J. Wysocki wrote: >> On Mon, Dec 1, 2025 at 10:46 AM YangYang <yang.yang@vivo.com> wrote: > > [cut] > >> If blk_queue_enter() or __bio_queue_enter() is allowed to race with >> disabling runtime PM for q->dev, failure to resume q->dev is alway >> possible and there are no changes that can be made to >> pm_runtime_disable() to prevent that from happening. If >> __pm_runtime_disable() wins the race, it will increment >> power.disable_depth and rpm_resume() will bail out when it sees that >> no matter what. >> >> You should not conflate "runtime PM doesn't work when it is disabled" >> with "asynchronous runtime PM doesn't work after freezing the PM >> workqueue". They are both true, but they are not the same. > > So I've been testing the patch below for a few days and it will eliminate > the latter, but even after this patch runtime PM will be disabled in > device_suspend_late() and if the problem you are facing is still there > after this patch, it will need to dealt with at the driver level. > > Generally speaking, driver involvement is needed to make runtime PM and > system suspend/resume work together in the majority of cases. > Thank you. I'll perform some tests with this patch applied. ^ permalink raw reply [flat|nested] 44+ messages in thread
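[Editorial note: the ordering argument quoted above — "If __pm_runtime_disable() wins the race, it will increment power.disable_depth and rpm_resume() will bail out" — can be sketched with a minimal userspace model. This is NOT kernel code: struct dev_power is a two-field stand-in and the model_* names are invented for illustration. It only captures the race outcome; the real rpm_resume() in drivers/base/power/runtime.c bails out with -EACCES when power.disable_depth is nonzero and the device is not already active.]

```c
#include <assert.h>
#include <errno.h>

/* Illustrative stand-in for the relevant fields of struct dev_pm_info. */
struct dev_power {
	int disable_depth;	/* > 0 means runtime PM is disabled */
	int runtime_status;	/* 1 = suspended, 0 = active */
};

/* Models __pm_runtime_disable() winning the race: bump the disable count. */
static void model_runtime_disable(struct dev_power *power)
{
	power->disable_depth++;
}

/* Models rpm_resume(): refuse to resume while runtime PM is disabled. */
static int model_rpm_resume(struct dev_power *power)
{
	if (power->disable_depth > 0)
		return -EACCES;		/* bail out, device stays suspended */
	power->runtime_status = 0;	/* resumed to active */
	return 0;
}
```

If the disable happens first, no later change to the disable path can make the resume succeed — which is the point being made in the quoted text.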
* Re: [PATCH v1] PM: sleep: Do not flag runtime PM workqueue as freezable 2025-12-01 19:58 ` [PATCH v1] PM: sleep: Do not flag runtime PM workqueue as freezable Rafael J. Wysocki 2025-12-02 1:06 ` Bart Van Assche 2025-12-02 10:36 ` YangYang @ 2025-12-02 14:58 ` Ulf Hansson 2 siblings, 0 replies; 44+ messages in thread From: Ulf Hansson @ 2025-12-02 14:58 UTC (permalink / raw) To: Rafael J. Wysocki Cc: YangYang, Bart Van Assche, Jens Axboe, Greg Kroah-Hartman, Danilo Krummrich, linux-block, linux-kernel, linux-pm On Mon, 1 Dec 2025 at 20:58, Rafael J. Wysocki <rafael@kernel.org> wrote: > > On Monday, December 1, 2025 7:47:46 PM CET Rafael J. Wysocki wrote: > > On Mon, Dec 1, 2025 at 10:46 AM YangYang <yang.yang@vivo.com> wrote: > > [cut] > > > If blk_queue_enter() or __bio_queue_enter() is allowed to race with > > disabling runtime PM for q->dev, failure to resume q->dev is alway > > possible and there are no changes that can be made to > > pm_runtime_disable() to prevent that from happening. If > > __pm_runtime_disable() wins the race, it will increment > > power.disable_depth and rpm_resume() will bail out when it sees that > > no matter what. > > > > You should not conflate "runtime PM doesn't work when it is disabled" > > with "asynchronous runtime PM doesn't work after freezing the PM > > workqueue". They are both true, but they are not the same. > > So I've been testing the patch below for a few days and it will eliminate > the latter, but even after this patch runtime PM will be disabled in > device_suspend_late() and if the problem you are facing is still there > after this patch, it will need to dealt with at the driver level. > > Generally speaking, driver involvement is needed to make runtime PM and > system suspend/resume work together in the majority of cases. > > --- > From: Rafael J. 
Wysocki <rafael.j.wysocki@intel.com> > Subject: > > Till now, the runtime PM workqueue has been flagged as freezable, so it > does not process work items during system-wide PM transitions like > system suspend and resume. The original reason to do that was to > reduce the likelihood of runtime PM getting in the way of system-wide > PM processing, but now it is mostly an optimization because (1) runtime > suspend of devices is prevented by bumping up their runtime PM usage > counters in device_prepare() and (2) device drivers are expected to > disable runtime PM for the devices handled by them before they embark > on system-wide PM activities that may change the state of the hardware > or otherwise interfere with runtime PM. However, it prevents > asynchronous runtime resume of devices from working during system-wide > PM transitions, which is confusing because synchronous runtime resume > is not prevented at the same time, and it also sometimes turns out to > be problematic. > > For example, it has been reported that blk_queue_enter() may deadlock > during a system suspend transition because of the pm_request_resume() > usage in it [1]. That happens because the asynchronous runtime resume > of the given device is not processed due to the freezing of the runtime > PM workqueue. While it may be better to address this particular issue > in the block layer, the very presence of it means that similar problems > may be expected to occur elsewhere. > > For this reason, remove the WQ_FREEZABLE flag from the runtime PM > workqueue and make device_suspend_late() use the generic variant of > pm_runtime_disable() that will carry out runtime PM of the device > synchronously if there is pending resume work for it. > > Also update the comment before the pm_runtime_disable() call in > device_suspend_late() to document the fact that the runtime PM > should not be expected to work for the device until the end of > device_resume_early(). 
> > This change may, even though it is not expected to, uncover some > latent issues related to queuing up asynchronous runtime resume > work items during system suspend or hibernation. However, they > should be limited to the interference between runtime resume and > system-wide PM callbacks in the cases when device drivers start > to handle system-wide PM before disabling runtime PM as described > above. > > Link: https://lore.kernel.org/linux-pm/20251126101636.205505-2-yang.yang@vivo.com/ > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> I agree with the above and this seems like a reasonable change to me. Yep, it's not entirely easy to know whether all users of pm_request_resume() (and similar) are fine with this too, but in general I think they should. So, feel free to add: Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org> Kind regards Uffe > --- > drivers/base/power/main.c | 7 ++++--- > kernel/power/main.c | 2 +- > 2 files changed, 5 insertions(+), 4 deletions(-) > > --- a/drivers/base/power/main.c > +++ b/drivers/base/power/main.c > @@ -1647,10 +1647,11 @@ static void device_suspend_late(struct d > goto Complete; > > /* > - * Disable runtime PM for the device without checking if there is a > - * pending resume request for it. > + * After this point, any runtime PM operations targeting the device > + * will fail until the corresponding pm_runtime_enable() call in > + * device_resume_early(). > */ > - __pm_runtime_disable(dev, false); > + pm_runtime_disable(dev); > > if (dev->power.syscore) > goto Skip; > --- a/kernel/power/main.c > +++ b/kernel/power/main.c > @@ -1125,7 +1125,7 @@ EXPORT_SYMBOL_GPL(pm_wq); > > static int __init pm_start_workqueues(void) > { > - pm_wq = alloc_workqueue("pm", WQ_FREEZABLE | WQ_UNBOUND, 0); > + pm_wq = alloc_workqueue("pm", WQ_UNBOUND, 0); > if (!pm_wq) > return -ENOMEM; > > > > ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [PATCH 1/2] PM: runtime: Fix I/O hang due to race between resume and runtime disable 2025-12-01 18:47 ` Rafael J. Wysocki 2025-12-01 19:58 ` [PATCH v1] PM: sleep: Do not flag runtime PM workqueue as freezable Rafael J. Wysocki @ 2025-12-02 0:40 ` Bart Van Assche 2025-12-02 12:14 ` Rafael J. Wysocki 2025-12-05 15:24 ` [PATCH v2] PM: sleep: Do not flag runtime PM workqueue as freezable Rafael J. Wysocki 2 siblings, 1 reply; 44+ messages in thread From: Bart Van Assche @ 2025-12-02 0:40 UTC (permalink / raw) To: Rafael J. Wysocki, YangYang Cc: Jens Axboe, Pavel Machek, Len Brown, Greg Kroah-Hartman, Danilo Krummrich, linux-block, linux-kernel, linux-pm On 12/1/25 10:47 AM, Rafael J. Wysocki wrote: > Generally speaking, if blk_queue_enter() or __bio_queue_enter() may > run in parallel with device_suspend_late() for q->dev, the driver of > that device is defective, because it is responsible for preventing > this situation from happening. The most straightforward way to > achieve that is to provide a .suspend() callback for q->dev that will > runtime-resume it (and, of course, q->dev will need to be prepared for > system suspend as appropriate after that). Isn't the suspend / hibernation order such that no block I/O is submitted while block devices transition to a lower power state? I'm surprised to read that individual drivers are responsible for preventing blk_queue_enter() or __bio_queue_enter() from running concurrently with device_suspend_late(). Regarding the UFSHCI driver: if a UFS controller is already runtime suspended, we want it to remain suspended during system suspend. Thanks, Bart. ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [PATCH 1/2] PM: runtime: Fix I/O hang due to race between resume and runtime disable 2025-12-02 0:40 ` [PATCH 1/2] PM: runtime: Fix I/O hang due to race between resume and runtime disable Bart Van Assche @ 2025-12-02 12:14 ` Rafael J. Wysocki 2025-12-02 13:37 ` Rafael J. Wysocki 0 siblings, 1 reply; 44+ messages in thread From: Rafael J. Wysocki @ 2025-12-02 12:14 UTC (permalink / raw) To: Bart Van Assche Cc: Rafael J. Wysocki, YangYang, Jens Axboe, Pavel Machek, Len Brown, Greg Kroah-Hartman, Danilo Krummrich, linux-block, linux-kernel, linux-pm On Tue, Dec 2, 2025 at 1:41 AM Bart Van Assche <bvanassche@acm.org> wrote: > > On 12/1/25 10:47 AM, Rafael J. Wysocki wrote: > > Generally speaking, if blk_queue_enter() or __bio_queue_enter() may > > run in parallel with device_suspend_late() for q->dev, the driver of > > that device is defective, because it is responsible for preventing > > this situation from happening. The most straightforward way to > > achieve that is to provide a .suspend() callback for q->dev that will > > runtime-resume it (and, of course, q->dev will need to be prepared for > > system suspend as appropriate after that). > > Isn't the suspend / hibernation order such that no block I/O is > submitted while block devices transition to a lower power state? I'm > surprised to read that individual drivers are responsible for preventing > that blk_queue_enter() or __bio_queue_enter() run concurrently with > device_suspend_late(). To be more precise, they don't need to be prevented from running concurrently with device_suspend_late() in general. The driver needs to ensure though that q->dev is not runtime-suspended in device_suspend_late() if blk_queue_enter() or __bio_queue_enter() are expected to run in parallel with it or later. > Regarding the UFSHCI driver: if a UFS controller is already runtime > suspended, we want it to remain suspended during system suspend. 
That can be done, but still the driver is responsible for preparing the device for system suspend. The most popular strategy is to use pm_runtime_force_suspend/resume() as driver suspend callbacks for the device, either as .suspend()/.resume() or as .suspend_late()/resume_early(), respectively. In both cases, runtime PM will be disabled and runtime PM callbacks will be used for stopping the device - or not, if it is suspended already - but after that it must not be accessed in any way until the resume part runs. ^ permalink raw reply [flat|nested] 44+ messages in thread
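[Editorial note: the strategy described above can be sketched as a hypothetical driver's dev_pm_ops table — the foo_* names are illustrative, not from the thread, and this is a sketch rather than a buildable driver. pm_runtime_force_suspend() disables runtime PM and runs the runtime-suspend callback only if the device is still active; an already runtime-suspended device is left in suspend, which matches the UFS behavior asked about.]

```c
#include <linux/pm_runtime.h>

static int foo_runtime_suspend(struct device *dev)
{
	/* Put the hardware into a low-power state. */
	return 0;
}

static int foo_runtime_resume(struct device *dev)
{
	/* Bring the hardware back up. */
	return 0;
}

static const struct dev_pm_ops foo_pm_ops = {
	/*
	 * Reuse the runtime PM callbacks for system sleep: the device is
	 * stopped (or left suspended) with runtime PM disabled, and must
	 * not be touched until the resume part runs.
	 */
	SYSTEM_SLEEP_PM_OPS(pm_runtime_force_suspend, pm_runtime_force_resume)
	RUNTIME_PM_OPS(foo_runtime_suspend, foo_runtime_resume, NULL)
};
```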
* Re: [PATCH 1/2] PM: runtime: Fix I/O hang due to race between resume and runtime disable 2025-12-02 12:14 ` Rafael J. Wysocki @ 2025-12-02 13:37 ` Rafael J. Wysocki 0 siblings, 0 replies; 44+ messages in thread From: Rafael J. Wysocki @ 2025-12-02 13:37 UTC (permalink / raw) To: Bart Van Assche Cc: YangYang, Jens Axboe, Pavel Machek, Len Brown, Greg Kroah-Hartman, Danilo Krummrich, linux-block, linux-kernel, linux-pm On Tue, Dec 2, 2025 at 1:14 PM Rafael J. Wysocki <rafael@kernel.org> wrote: > > On Tue, Dec 2, 2025 at 1:41 AM Bart Van Assche <bvanassche@acm.org> wrote: > > > > On 12/1/25 10:47 AM, Rafael J. Wysocki wrote: > > > Generally speaking, if blk_queue_enter() or __bio_queue_enter() may > > > run in parallel with device_suspend_late() for q->dev, the driver of > > > that device is defective, because it is responsible for preventing > > > this situation from happening. The most straightforward way to > > > achieve that is to provide a .suspend() callback for q->dev that will > > > runtime-resume it (and, of course, q->dev will need to be prepared for > > > system suspend as appropriate after that). > > > > Isn't the suspend / hibernation order such that no block I/O is > > submitted while block devices transition to a lower power state? I'm > > surprised to read that individual drivers are responsible for preventing > > that blk_queue_enter() or __bio_queue_enter() run concurrently with > > device_suspend_late(). > > To be more precise, they don't need to be prevented from running > concurrently with device_suspend_late() in general. The driver needs > to ensure though that q->dev is not runtime-suspended in > device_suspend_late() if blk_queue_enter() or __bio_queue_enter() are > expected to run in parallel with it or later. > > > Regarding the UFSHCI driver: if a UFS controller is already runtime > > suspended, we want it to remain suspended during system suspend. 
> > That can be done, but still the driver is responsible for preparing > the device for system suspend. > > The most popular strategy is to use pm_runtime_force_suspend/resume() > as driver suspend callbacks for the device, either as > .suspend()/.resume() or as .suspend_late()/resume_early(), > respectively. In both cases, runtime PM will be disabled and runtime > PM callbacks will be used for stopping the device - or not, if it is > suspended already - but after that it must not be accessed in any way > until the resume part runs. One more thing that needs to be said here: The PM core expects the decision on whether or not to leave a runtime-suspended device in suspend across system-wide suspend-resume to be made before device_suspend_late() is called for that device. If the device is suspended at that point, the expectation is that it will be left in suspend. Otherwise, the expectation is that it will be taken care of by the .suspend_late() and .suspend_noirq() callbacks (and this goes beyond runtime PM, quite obviously). ^ permalink raw reply [flat|nested] 44+ messages in thread
* [PATCH v2] PM: sleep: Do not flag runtime PM workqueue as freezable 2025-12-01 18:47 ` Rafael J. Wysocki 2025-12-01 19:58 ` [PATCH v1] PM: sleep: Do not flag runtime PM workqueue as freezable Rafael J. Wysocki 2025-12-02 0:40 ` [PATCH 1/2] PM: runtime: Fix I/O hang due to race between resume and runtime disable Bart Van Assche @ 2025-12-05 15:24 ` Rafael J. Wysocki 2025-12-05 19:10 ` Bart Van Assche 2 siblings, 1 reply; 44+ messages in thread From: Rafael J. Wysocki @ 2025-12-05 15:24 UTC (permalink / raw) To: linux-pm Cc: YangYang, Bart Van Assche, Jens Axboe, linux-block, linux-kernel, Ulf Hansson From: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Till now, the runtime PM workqueue has been flagged as freezable, so it does not process work items during system-wide PM transitions like system suspend and resume. The original reason to do that was to reduce the likelihood of runtime PM getting in the way of system-wide PM processing, but now it is mostly an optimization because (1) runtime suspend of devices is prevented by bumping up their runtime PM usage counters in device_prepare() and (2) device drivers are expected to disable runtime PM for the devices handled by them before they embark on system-wide PM activities that may change the state of the hardware or otherwise interfere with runtime PM. However, it prevents asynchronous runtime resume of devices from working during system-wide PM transitions, which is confusing because synchronous runtime resume is not prevented at the same time, and it also sometimes turns out to be problematic. For example, it has been reported that blk_queue_enter() may deadlock during a system suspend transition because of the pm_request_resume() usage in it [1]. That happens because the asynchronous runtime resume of the given device is not processed due to the freezing of the runtime PM workqueue. 
While it may be better to address this particular issue in the block layer, the very presence of it means that similar problems may be expected to occur elsewhere. For this reason, remove the WQ_FREEZABLE flag from the runtime PM workqueue and make device_suspend_late() use the generic variant of pm_runtime_disable() that will carry out runtime PM of the device synchronously if there is pending resume work for it. Also update the comment before the pm_runtime_disable() call in device_suspend_late(), to document the fact that the runtime PM should not be expected to work for the device until the end of device_resume_early(), and update the related documentation. This change may, even though it is not expected to, uncover some latent issues related to queuing up asynchronous runtime resume work items during system suspend or hibernation. However, they should be limited to the interference between runtime resume and system-wide PM callbacks in the cases when device drivers start to handle system-wide PM before disabling runtime PM as described above. Link: https://lore.kernel.org/linux-pm/20251126101636.205505-2-yang.yang@vivo.com/ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org> --- v1 -> v2: * Update documentation in runtime_pm.rst. * Add R-by from Ulf. --- Documentation/power/runtime_pm.rst | 7 +++---- drivers/base/power/main.c | 7 ++++--- kernel/power/main.c | 2 +- 3 files changed, 8 insertions(+), 8 deletions(-) --- a/Documentation/power/runtime_pm.rst +++ b/Documentation/power/runtime_pm.rst @@ -714,10 +714,9 @@ out the following operations: * During system suspend pm_runtime_get_noresume() is called for every device right before executing the subsystem-level .prepare() callback for it and pm_runtime_barrier() is called for every device right before executing the - subsystem-level .suspend() callback for it. 
In addition to that the PM core - calls __pm_runtime_disable() with 'false' as the second argument for every - device right before executing the subsystem-level .suspend_late() callback - for it. + subsystem-level .suspend() callback for it. In addition to that, the PM + core disables runtime PM for every device right before executing the + subsystem-level .suspend_late() callback for it. * During system resume pm_runtime_enable() and pm_runtime_put() are called for every device right after executing the subsystem-level .resume_early() --- a/drivers/base/power/main.c +++ b/drivers/base/power/main.c @@ -1647,10 +1647,11 @@ static void device_suspend_late(struct d goto Complete; /* - * Disable runtime PM for the device without checking if there is a - * pending resume request for it. + * After this point, any runtime PM operations targeting the device + * will fail until the corresponding pm_runtime_enable() call in + * device_resume_early(). */ - __pm_runtime_disable(dev, false); + pm_runtime_disable(dev); if (dev->power.syscore) goto Skip; --- a/kernel/power/main.c +++ b/kernel/power/main.c @@ -1125,7 +1125,7 @@ EXPORT_SYMBOL_GPL(pm_wq); static int __init pm_start_workqueues(void) { - pm_wq = alloc_workqueue("pm", WQ_FREEZABLE | WQ_UNBOUND, 0); + pm_wq = alloc_workqueue("pm", WQ_UNBOUND, 0); if (!pm_wq) return -ENOMEM; ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [PATCH v2] PM: sleep: Do not flag runtime PM workqueue as freezable 2025-12-05 15:24 ` [PATCH v2] PM: sleep: Do not flag runtime PM workqueue as freezable Rafael J. Wysocki @ 2025-12-05 19:10 ` Bart Van Assche 2025-12-07 11:23 ` Rafael J. Wysocki 0 siblings, 1 reply; 44+ messages in thread From: Bart Van Assche @ 2025-12-05 19:10 UTC (permalink / raw) To: Rafael J. Wysocki, linux-pm Cc: YangYang, Jens Axboe, linux-block, linux-kernel, Ulf Hansson On 12/5/25 5:24 AM, Rafael J. Wysocki wrote: > For example, it has been reported that blk_queue_enter() may deadlock > during a system suspend transition because of the pm_request_resume() > usage in it [1]. System resume is also affected. If pm_request_resume() is called before the device it applies to is resumed by the system resume code then the pm_request_resume() call also hangs. Otherwise this patch looks good to me. Thanks, Bart. ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [PATCH v2] PM: sleep: Do not flag runtime PM workqueue as freezable 2025-12-05 19:10 ` Bart Van Assche @ 2025-12-07 11:23 ` Rafael J. Wysocki 0 siblings, 0 replies; 44+ messages in thread From: Rafael J. Wysocki @ 2025-12-07 11:23 UTC (permalink / raw) To: Bart Van Assche Cc: Rafael J. Wysocki, linux-pm, YangYang, Jens Axboe, linux-block, linux-kernel, Ulf Hansson On Fri, Dec 5, 2025 at 8:11 PM Bart Van Assche <bvanassche@acm.org> wrote: > > On 12/5/25 5:24 AM, Rafael J. Wysocki wrote: > > For example, it has been reported that blk_queue_enter() may deadlock > > during a system suspend transition because of the pm_request_resume() > > usage in it [1]. > > System resume is also affected. If pm_request_resume() is called before > the device it applies to is resumed by the system resume code then the > pm_request_resume() call also hangs. Rather, the work item queued by it will not make progress. OK, I'll add this information to the patch changelog while applying it. > Otherwise this patch looks good to me. Thank you! ^ permalink raw reply [flat|nested] 44+ messages in thread
* [PATCH 2/2] blk-mq: Fix I/O hang caused by incomplete device resume 2025-11-26 10:16 [PATCH 0/2] PM: runtime: Fix potential I/O hang Yang Yang 2025-11-26 10:16 ` [PATCH 1/2] PM: runtime: Fix I/O hang due to race between resume and runtime disable Yang Yang @ 2025-11-26 10:16 ` Yang Yang 2025-11-26 11:31 ` [PATCH 0/2] PM: runtime: Fix potential I/O hang Rafael J. Wysocki 2 siblings, 0 replies; 44+ messages in thread From: Yang Yang @ 2025-11-26 10:16 UTC (permalink / raw) To: Jens Axboe, Rafael J. Wysocki, Pavel Machek, Len Brown, Greg Kroah-Hartman, Danilo Krummrich, linux-block, linux-kernel, linux-pm Cc: Yang Yang Setting the force_check_resume flag ensures the device is resumed properly. Signed-off-by: Yang Yang <yang.yang@vivo.com> --- block/blk-pm.c | 1 + 1 file changed, 1 insertion(+) diff --git a/block/blk-pm.c b/block/blk-pm.c index 8d3e052f91da..d23918fbd59f 100644 --- a/block/blk-pm.c +++ b/block/blk-pm.c @@ -28,6 +28,7 @@ */ void blk_pm_runtime_init(struct request_queue *q, struct device *dev) { + dev->power.force_check_resume = true; q->dev = dev; q->rpm_status = RPM_ACTIVE; pm_runtime_set_autosuspend_delay(q->dev, -1); -- 2.34.1 ^ permalink raw reply related [flat|nested] 44+ messages in thread
* Re: [PATCH 0/2] PM: runtime: Fix potential I/O hang 2025-11-26 10:16 [PATCH 0/2] PM: runtime: Fix potential I/O hang Yang Yang 2025-11-26 10:16 ` [PATCH 1/2] PM: runtime: Fix I/O hang due to race between resume and runtime disable Yang Yang 2025-11-26 10:16 ` [PATCH 2/2] blk-mq: Fix I/O hang caused by incomplete device resume Yang Yang @ 2025-11-26 11:31 ` Rafael J. Wysocki 2025-11-26 15:48 ` Bart Van Assche 2 siblings, 1 reply; 44+ messages in thread From: Rafael J. Wysocki @ 2025-11-26 11:31 UTC (permalink / raw) To: Yang Yang Cc: Jens Axboe, Rafael J. Wysocki, Pavel Machek, Len Brown, Greg Kroah-Hartman, Danilo Krummrich, linux-block, linux-kernel, linux-pm On Wed, Nov 26, 2025 at 11:17 AM Yang Yang <yang.yang@vivo.com> wrote: > > > Yang Yang (2): > PM: runtime: Fix I/O hang due to race between resume and runtime > disable > blk-mq: Fix I/O hang caused by incomplete device resume This is a no-go as far as I'm concerned. Please address the issue differently. ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [PATCH 0/2] PM: runtime: Fix potential I/O hang 2025-11-26 11:31 ` [PATCH 0/2] PM: runtime: Fix potential I/O hang Rafael J. Wysocki @ 2025-11-26 15:48 ` Bart Van Assche 2025-11-26 16:59 ` Rafael J. Wysocki 0 siblings, 1 reply; 44+ messages in thread From: Bart Van Assche @ 2025-11-26 15:48 UTC (permalink / raw) To: Rafael J. Wysocki, Yang Yang Cc: Jens Axboe, Pavel Machek, Len Brown, Greg Kroah-Hartman, Danilo Krummrich, linux-block, linux-kernel, linux-pm On 11/26/25 3:31 AM, Rafael J. Wysocki wrote: > Please address the issue differently. It seems unfortunate to me that __pm_runtime_barrier() can cause pm_request_resume() to hang. Would it be safe to remove the cancel_work_sync() call from __pm_runtime_barrier() since pm_runtime_work() calls functions that check disable_depth when processing RPM_REQ_SUSPEND and RPM_REQ_AUTOSUSPEND? Would this be sufficient to fix the reported deadlock? Thanks, Bart. ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [PATCH 0/2] PM: runtime: Fix potential I/O hang 2025-11-26 15:48 ` Bart Van Assche @ 2025-11-26 16:59 ` Rafael J. Wysocki 2025-11-26 17:21 ` Rafael J. Wysocki 0 siblings, 1 reply; 44+ messages in thread From: Rafael J. Wysocki @ 2025-11-26 16:59 UTC (permalink / raw) To: Bart Van Assche Cc: Rafael J. Wysocki, Yang Yang, Jens Axboe, Pavel Machek, Len Brown, Greg Kroah-Hartman, Danilo Krummrich, linux-block, linux-kernel, linux-pm On Wed, Nov 26, 2025 at 4:48 PM Bart Van Assche <bvanassche@acm.org> wrote: > > On 11/26/25 3:31 AM, Rafael J. Wysocki wrote: > > Please address the issue differently. > > It seems unfortunate to me that __pm_runtime_barrier() can cause pm_request_resume() to hang. I wouldn't call it a hang. __pm_runtime_barrier() removes the work item queued by pm_request_resume(), but at the time when it is called, which is device_suspend_late(), the work item queued by pm_request_resume() cannot make progress anyway. It will only be able to make progress when the PM workqueue is unfrozen at the end of the system resume transition. > Would it be safe to remove the > cancel_work_sync() call from __pm_runtime_barrier() since > pm_runtime_work() calls functions that check disable_depth > when processing RPM_REQ_SUSPEND and RPM_REQ_AUTOSUSPEND? Would > this be sufficient to fix the reported deadlock? If you want the resume work item to survive the system suspend/resume cycle, __pm_runtime_disable() may be changed to make that happen, but this still will not allow the work to make progress until the system resume ends. I'm not sure if this would help to address the issue at hand though. ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [PATCH 0/2] PM: runtime: Fix potential I/O hang 2025-11-26 16:59 ` Rafael J. Wysocki @ 2025-11-26 17:21 ` Rafael J. Wysocki 2025-11-26 17:34 ` Rafael J. Wysocki 0 siblings, 1 reply; 44+ messages in thread From: Rafael J. Wysocki @ 2025-11-26 17:21 UTC (permalink / raw) To: Bart Van Assche Cc: Yang Yang, Jens Axboe, Pavel Machek, Len Brown, Greg Kroah-Hartman, Danilo Krummrich, linux-block, linux-kernel, linux-pm On Wed, Nov 26, 2025 at 5:59 PM Rafael J. Wysocki <rafael@kernel.org> wrote: > > On Wed, Nov 26, 2025 at 4:48 PM Bart Van Assche <bvanassche@acm.org> wrote: > > > > On 11/26/25 3:31 AM, Rafael J. Wysocki wrote: > > > Please address the issue differently. > > > > It seems unfortunate to me that __pm_runtime_barrier() can cause pm_request_resume() to hang. > > I wouldn't call it a hang. > > __pm_runtime_barrier() removes the work item queued by > pm_request_resume(), but at the time when it is called, which is > device_suspend_late(), the work item queued by pm_request_resume() > cannot make progress anyway. It will only be able to make progress > when the PM workqueue is unfrozen at the end of the system resume > transition. > > > Would it be safe to remove the > > cancel_work_sync() call from __pm_runtime_barrier() since > > pm_runtime_work() calls functions that check disable_depth > > when processing RPM_REQ_SUSPEND and RPM_REQ_AUTOSUSPEND? Would > > this be sufficient to fix the reported deadlock? > > If you want the resume work item to survive the system suspend/resume > cycle, __pm_runtime_disable() may be changed to make that happen, but > this still will not allow the work to make progress until the system > resume ends. > > I'm not sure if this would help to address the issue at hand though. I actually have a better idea: Why don't we resume all devices that have runtime resume work items pending at the time when device_suspend() is called? 
Arguably, somebody wanted them to runtime-resume, so they should be resumed before being prepared for system suspend and that will eliminate the issue at hand (because devices cannot suspend during system suspend/resume). ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [PATCH 0/2] PM: runtime: Fix potential I/O hang 2025-11-26 17:21 ` Rafael J. Wysocki @ 2025-11-26 17:34 ` Rafael J. Wysocki 0 siblings, 0 replies; 44+ messages in thread From: Rafael J. Wysocki @ 2025-11-26 17:34 UTC (permalink / raw) To: Bart Van Assche, Yang Yang Cc: Jens Axboe, Pavel Machek, Len Brown, Greg Kroah-Hartman, Danilo Krummrich, linux-block, linux-kernel, linux-pm On Wed, Nov 26, 2025 at 6:21 PM Rafael J. Wysocki <rafael@kernel.org> wrote: > > On Wed, Nov 26, 2025 at 5:59 PM Rafael J. Wysocki <rafael@kernel.org> wrote: > > > > On Wed, Nov 26, 2025 at 4:48 PM Bart Van Assche <bvanassche@acm.org> wrote: > > > > > > On 11/26/25 3:31 AM, Rafael J. Wysocki wrote: > > > > Please address the issue differently. > > > > > > It seems unfortunate to me that __pm_runtime_barrier() can cause pm_request_resume() to hang. > > > > I wouldn't call it a hang. > > > > __pm_runtime_barrier() removes the work item queued by > > pm_request_resume(), but at the time when it is called, which is > > device_suspend_late(), the work item queued by pm_request_resume() > > cannot make progress anyway. It will only be able to make progress > > when the PM workqueue is unfrozen at the end of the system resume > > transition. > > > > > Would it be safe to remove the > > > cancel_work_sync() call from __pm_runtime_barrier() since > > > pm_runtime_work() calls functions that check disable_depth > > > when processing RPM_REQ_SUSPEND and RPM_REQ_AUTOSUSPEND? Would > > > this be sufficient to fix the reported deadlock? > > > > If you want the resume work item to survive the system suspend/resume > > cycle, __pm_runtime_disable() may be changed to make that happen, but > > this still will not allow the work to make progress until the system > > resume ends. > > > > I'm not sure if this would help to address the issue at hand though. 
>
> I actually have a better idea: Why don't we resume all devices that
> have runtime resume work items pending at the time when
> device_suspend() is called?
>
> Arguably, somebody wanted them to runtime-resume, so they should be
> resumed before being prepared for system suspend and that will
> eliminate the issue at hand (because devices cannot suspend during
> system suspend/resume).

Wait, there is a pm_runtime_barrier() call in device_suspend() that
does just that and additionally it calls __pm_runtime_barrier(), so
all of the pending runtime PM work items should be cancelled by it.

So it looks like the device in question is runtime-suspended at that
point and only later blk_pm_resume_queue() is called to resume it.
I'm wondering where it is called from.

And maybe pm_runtime_resume() should be called for it from its
->suspend() callback?
end of thread, other threads:[~2025-12-07 11:23 UTC | newest]

Thread overview: 44+ messages
2025-11-26 10:16 [PATCH 0/2] PM: runtime: Fix potential I/O hang Yang Yang
2025-11-26 10:16 ` [PATCH 1/2] PM: runtime: Fix I/O hang due to race between resume and runtime disable Yang Yang
2025-11-26 11:30 ` Rafael J. Wysocki
2025-11-26 11:59 ` YangYang
2025-11-26 12:36 ` Rafael J. Wysocki
2025-11-26 15:33 ` Bart Van Assche
2025-11-26 15:41 ` Rafael J. Wysocki
2025-11-26 18:40 ` Bart Van Assche
2025-11-27 11:29 ` YangYang
2025-11-27 12:44 ` Rafael J. Wysocki
2025-11-28  7:20 ` YangYang
2025-12-01 16:40 ` Bart Van Assche
2025-11-26 18:06 ` Bart Van Assche
2025-11-26 19:16 ` Rafael J. Wysocki
2025-11-26 19:34 ` Rafael J. Wysocki
2025-11-26 20:17 ` Rafael J. Wysocki
2025-11-26 21:10 ` Bart Van Assche
2025-11-26 21:30 ` Rafael J. Wysocki
2025-11-26 22:47 ` Bart Van Assche
2025-11-27 12:34 ` Rafael J. Wysocki
2025-12-01  9:46 ` YangYang
2025-12-01 12:56 ` YangYang
2025-12-01 18:55 ` Rafael J. Wysocki
2025-12-02 10:33 ` YangYang
2025-12-02 12:18 ` Rafael J. Wysocki
2025-12-01 18:47 ` Rafael J. Wysocki
2025-12-01 19:58 ` [PATCH v1] PM: sleep: Do not flag runtime PM workqueue as freezable Rafael J. Wysocki
2025-12-02  1:06 ` Bart Van Assche
2025-12-02 11:53 ` Rafael J. Wysocki
2025-12-02 13:29 ` Rafael J. Wysocki
2025-12-02 10:36 ` YangYang
2025-12-02 14:58 ` Ulf Hansson
2025-12-02  0:40 ` [PATCH 1/2] PM: runtime: Fix I/O hang due to race between resume and runtime disable Bart Van Assche
2025-12-02 12:14 ` Rafael J. Wysocki
2025-12-02 13:37 ` Rafael J. Wysocki
2025-12-05 15:24 ` [PATCH v2] PM: sleep: Do not flag runtime PM workqueue as freezable Rafael J. Wysocki
2025-12-05 19:10 ` Bart Van Assche
2025-12-07 11:23 ` Rafael J. Wysocki
2025-11-26 10:16 ` [PATCH 2/2] blk-mq: Fix I/O hang caused by incomplete device resume Yang Yang
2025-11-26 11:31 ` [PATCH 0/2] PM: runtime: Fix potential I/O hang Rafael J. Wysocki
2025-11-26 15:48 ` Bart Van Assche
2025-11-26 16:59 ` Rafael J. Wysocki
2025-11-26 17:21 ` Rafael J. Wysocki
2025-11-26 17:34 ` Rafael J. Wysocki
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox